GitXplorerGitXplorer
m

eBPF-Guide

public
560 stars
52 forks
0 issues

Commits

List of commits on branch main.
Verified
abc9295bdd2497947a38a888ccccb2008f03a382

Updated Maintenance.

mmikeroyal committed a year ago
Verified
9916c244f365cff038cf714413337a6876901cfe

Merge pull request #15 from mikeroyal/mikeroyal-patch-6

mmikeroyal committed a year ago
Verified
b1f2b34d0ed923015cb2b59faedefdca9b1baa7f

Fixed typo.

mmikeroyal committed a year ago
Verified
4c5adb1ef71dbe1a5ca31ca6e8e7272c2692f4a2

Merge pull request #14 from mikeroyal/mikeroyal-patch-6

mmikeroyal committed a year ago
Verified
b22d000bd40a0a584334b05e9d0b0f319ff8aaeb

Added more eBPF tools & frameworks for monitoring.

mmikeroyal committed a year ago
Verified
1f95ca3e639f40e42883bd35731058b9146c424d

Updated Contributing.md

mmikeroyal committed 2 years ago

README

The README file for this repository.


eBPF Guide

followers

Maintenance Last-Commit

A guide covering eBPF including the applications, libraries and tools that will make you a better and more efficient eBPF development.

Note: You can easily convert this markdown file to a PDF in VSCode using this handy extension Markdown PDF.


Table of Contents

  1. Getting Started with eBPF

  2. LLVM Development

  3. GO Development

  4. C++ Development

  5. Rust Development

  6. Networking

  7. Kubernetes

  8. Databases

Getting Started with eBPF

Back to the Top

eBPF Companies

  • Isovalent is a company founded by the creators of Cilium and eBPF. They build open source software and enterprise solutions solving networking, security, and observability needs for modern cloud native infrastructure.

eBPF Tools & Libraries

eBPF is a technology that can run sandboxed programs in the Linux kernel without changing kernel source code or loading kernel modules. By making the Linux kernel programmable, infrastructure software can leverage existing layers, making them more intelligent and feature-rich without continuing to add additional layers of complexity to the system.


eBPF Architecture Overview. Credit: eBPF.io

eBPF for Windows is an eBPF implementation that runs on top of Windows. eBPF is a well-known technology for providing programmability and agility, especially for extending an OS kernel, for use cases such as DoS protection and observability. Cilium L4 Load Balancer using eBPF-for-Windows


eBPF for Windows Architecture Overview. Credit: Microsoft

XDP(eXpress Data Path) is a technology that allows developers to attach eBPF programs to low-level hooks, implemented by network device drivers in the Linux kernel(since version 4.8), as well as generic hooks that run after the device driver. XDP can be used to achieve high-performance packet processing in an eBPF architecture, primarily using kernel bypass.

AF_XDP is an address family that is optimized for high performance packet processing.

BPF Compiler Collection (BCC) is a toolkit for creating efficient kernel tracing and manipulation programs, and includes several useful tools and examples. It makes use of extended BPF (Berkeley Packet Filters), formally known as eBPF, a new feature that was first added to Linux 3.15. Though, much of what BCC uses requires Linux 4.1 and above.


BCC performance tools. Credit: Brendan Gregg

Bpftrace is a high-level tracing language for Linux eBPF. Its language is inspired by awk and C, and predecessor tracers such as DTrace and SystemTap. bpftrace uses LLVM as a backend to compile scripts to eBPF bytecode and makes use of BCC as a library for interacting with the Linux eBPF subsystem as well as existing Linux tracing capabilities and attachment points.

Cilium is an open source project that provides eBPF-powered networking, security and observability. It has been specifically designed from the ground up to bring the advantages of eBPF to the world of Kubernetes and to address the new scalability, security and visibility requirements of container workloads.

Falco is a behavioral activity monitor designed to detect anomalous activity in applications. Falco audits a system at the Linux kernel layer with the help of eBPF. It enriches gathered data with other input streams such as container runtime metrics and Kubernetes metrics, and allows to continuously monitor and detect container, application, host, and network activity.

Katran is a C++ library and eBPF program to build a high-performance layer 4 load balancing forwarding plane. Katran leverages the XDP infrastructure from the Linux kernel to provide an in-kernel facility for fast packet processing. Its performance scales linearly with the number of NIC's receive queues and it uses RSS friendly encapsulation for forwarding to L7 load balancers.

Hubble is a fully distributed networking and security observability platform for cloud native workloads. It is built on top of Cilium and eBPF to enable deep visibility into the communication and behavior of services as well as the networking infrastructure in a completely transparent manner.

Pixie is an open-source observability tool for Kubernetes applications. It's used to view the high-level state of your cluster (service maps, cluster resources, application traffic) and also drill down into more detailed views (pod state, flame graphs, individual full-body application requests).

BumbleBee is a tool that helps to build, run and distribute eBPF programs using OCI images. It allows you to focus on writing eBPF code, while taking care of the user space components that automatically exposing your data as metrics or logs.

Sysmon for Linux is a tool that monitors and logs system activity including process lifetime, network connections, file system writes, and more. Sysmon works across reboots and uses advanced filtering to help identify malicious activity as well as how intruders and malware operate on your network.

KubeArmor is a cloud-native runtime security enforcement system that restricts the behavior (such as process execution, file access, and networking operations) of pods, containers, and nodes (VMs) at the system level. It leverages Linux security modules (LSMs) such as AppArmor, SELinux, or BPF-LSM to enforce the user-specified policies.

Caretta is a lightweight, standalone tool that instantly creates a visual network map of the services running in your cluster. It leverages eBPF to efficiently map all service network interactions in a K8s cluster, and Grafana to query and visualize the collected data.

dae is a Linux high-performance transparent proxy solution based on eBPF.

eunomia-bpf is a compiler and runtime framework to help you Build, Distribute and Run CO-RE eBPF programs easier with JSON and Webassembly OCI images.

Kindling is an eBPF-based cloud native monitoring tool, which aims to help users understand the app behavior from kernel to code stack. With trace profiling, we hope the user can understand the application's behavior easily and find the root cause in seconds. It also provides an easy way to get an overview of network flows in the Kubernetes environment, and many built-in network monitor dashboards like TCP retransmit, DNS, throughput, and TPS.

Odigos is a tool for Distributed tracing without code changes. It instantly monitor any application using OpenTelemetry and eBPF.

SSHLog is a Linux daemon written in C++ and Python that passively monitors OpenSSH servers via eBPF. It's configurable, any combination of features may be enabled, disabled, or customized. It works with your existing OpenSSH server process, no alternative SSH daemon is required. Simply install the sshlog package to begin monitoring SSH.

L3AFD is the primary component of the L3AF control plane. It's a daemon that orchestrates and manages multiple eBPF programs. L3AFD runs on each node where the user wishes to run eBPF programs. L3AFD reads configuration data and manages the execution and monitoring of eBPF programs running on the node.

Wachy is a tool that provides a UI for interactive eBPF-based userspace performance debugging.

Merbridge is a tool that uses eBPF to speed up your Service Mesh like crossing an Einstein-Rosen Bridge.

DeepFlow is a highly automated observability platform for cloud-native developers. Using new technologies such as eBPF, WASM, and OpenTelemetry, DeepFlow innovatively implements core mechanisms such as AutoTracing, AutoMetrics, AutoTagging, and SmartEncoding, which greatly avoids code instrumentation and significantly reduces the resource overhead of back-end data warehouses.

Parca is a tool for continuous profiling for analysis of CPU and memory usage, down to the line number and throughout time. Saving infrastructure cost, improving performance, and increasing reliability.

loxilb is a tool that provides service type external load-balancer for K8s using eBPF as its core engine. It powers Edge|5G|IoT|XaaS Apps.

kube-loxilb is loxilb's implementation of kubernetes service load-balancer spec which includes support for load-balancer class, IPAM (shared or exclusive) etc.

loxi-ccm is a tool that provides an implementation of kubernetes load-balancer spec but it runs as a part of cloud-provider and provides load-balancer life-cycle management as part of it.

loxicmd is the command-line tool for loxilb. It is equivalent of "kubectl" for loxilb.

Kubectl-trace is a kubectl plugin that allows for scheduling the execution of bpftrace(8) programs in Kubernetes clusters. kubectl-trace does not require installation of any components directly onto a Kubernetes cluster in order to execute bpftrace programs. When pointed to a cluster, it schedules a temporary job called trace-runner that executes bpftrace.

Ply is a dynamic tracer for Linux which is built upon eBPF. It has been designed with embedded systems in mind, is written in C and all that ply needs to run is libc and a modern Linux kernel with eBPF support, meaning, it does not depend on LLVM for its program generation. It has a C-like syntax for writing scripts and is heavily inspired by awk(1) and dtrace(1).

Tracee is a Runtime Security and forensics tool for Linux. It is using Linux eBPF technology to trace your system and applications at runtime, and analyze collected events to detect suspicious behavioral patterns.

bpfcov is a source-code based coverage for eBPF programs actually running in the Linux kernel.

eCapture is a tool that captures SSL/TLS text content without CA cert using eBPF.

Tetragon is a eBPF-based Security Observability and Runtime Enforcement.

SkyWalking is an open source APM system, including monitoring, tracing, diagnosing capabilities for distributed system in Cloud Native architecture.

Skydive is an open source real-time network topology and protocols analyzer. It aims to provide a comprehensive way of understanding what is happening in the network infrastructure.

The Linux kernel contains the eBPF runtime required to run eBPF programs. It implements the bpf(2) system call for interacting with programs, maps, BTF and various attachment points where eBPF programs can be executed from. The kernel contains a eBPF verifier in order to check programs for safety and a JIT compiler to translate programs to native machine code. User space tooling such as bpftool and libbpf are also maintained as part of the upstream kernel.

Landlock LSM(Linux Security Module) is a framework to create scoped access-control (sandboxing). Landlock is designed to be usable by unprivileged processes while following the system security policy enforced by other access control mechanisms (DAC, LSM, etc.).

LLVM compiler infrastructure contains the eBPF backend required to translate programs written in a C-like syntax to eBPF instructions. LLVM generates eBPF ELF files which contain program code, map descriptions, relocation information and BTF meta data. These ELF files contain all necessary information for eBPF loaders such as libbpf to prepare and load programs into the Linux kernel.

Gobpf is a Go-based library which provides Go bindings for the BCC framework as well as low-level routines to load and use eBPF programs from ELF files.

rbpf is a Rust virtual machine and JIT compiler for eBPF programs.

Libbpfgo is a Go wrapper around libbpf. It supports BPF CO-RE and its goal is to be a complete implementation of libbpf APIs. It uses CGo to call into linked versions of libbpf.

Libbpf is a C/C++ based library which is maintained as part of the upstream Linux kernel. It contains an eBPF loader which takes over processing LLVM generated eBPF ELF files for loading into the kernel. libbpf received a major boost in capabilities and sophistication and closed many existing gaps with BCC as a library. It also supports important features not available in BCC such as global variables and BPF skeletons.

Libbpf-rs is a safe, idiomatic, and opinionated wrapper API around libbpf written in Rust. libbpf-rs, together with libbpf-cargo (libbpf cargo plugin) allows to write 'compile once run everywhere' (CO-RE) eBPF programs.

Redbpf is a Rust eBPF toolchain that contains a collection of Rust libraries to work with BPF/eBPF programs.

redcanary-ebpf-sensor - A set of BPF programs that gather security relevant event data from the Linux kernel. The BPF programs are combined into a single ELF file from which individual probes can be selectively loaded, depending on the running operating system and kernel version.

bpflock - Lock Linux machines is an eBPF driven security tool for locking and auditing Linux machines.

coroot-node-agent is an eBPF based Prometheus exporter that gathers comprehensive container metrics such as container-to-container TCP connections, network latency, CPU delay accounting, log summaries, cloud instance metadata, etc.

Kernel-collector is a Linux Kernel eBPF Collectors developed by Netdata.

socket-connect-bpf is a BPF/eBPF Linux command line utility that writes human-readable information about each application that makes new (network) connections to the standard output.

Polycube is an eBPF/XDP-based software framework for fast network services(such as bridges, routers, firewalls, and others) running in the Linux kernel. Polycube services, called cubes, can be composed to build arbitrary service chains and provide custom network connectivity to namespaces, containers, virtual machines, and physical hosts.

Books & Tutorials

Back to the Top

LLVM Development

Back to the Top


LLVM Learning Resources

LLVM is a library that has collection of modular/reusable compiler and toolchain components (assemblers, compilers, and debuggers). With these components LLVM can be used as a compiler framework, providing a front-end(parser and lexer) and a back-end (code that converts LLVM's representation to actual machine code).

Clang is a language front-end and tooling infrastructure for languages in the C language family (C, C++, Objective C/C++, OpenCL, CUDA, and RenderScript) for the LLVM project.

LLVM Project GitHub

LLVM Documentation

LLVM Discussion Forum

LLVM | Apple Developer Forums

Contributing to LLVM

Getting Started with LLVM

Getting Started with Clang

How To Setup Clang Tooling For LLVM

Using Clang-Tidy in Visual Studio

Configure VS Code for Clang/LLVM on macOS

LLVM Tools, Libraries and Frameworks

Visual Studio Code is a code editor redefined and optimized for building and debugging modern web and cloud applications.

Code Server is a tool that allows you to run VS Code on any machine anywhere and access it in the browser.

Clang-Format is a tool to format C/C++/Java/JavaScript/Objective-C/Objective-C++/Protobuf code.

Clang-Tidy is a clang-based C++ "linter" tool. Its purpose is to provide an extensible framework for diagnosing and fixing typical programming errors, like style violations, interface misuse, or bugs that can be deduced via static analysis. clang-tidy is modular and provides a convenient interface for writing new checks.

Clangd is a Visual Studio Code extension that provides C/C++ language IDE features for VS Code using clangd.

LLD is a linker from the LLVM project that is a drop-in replacement for system linkers and runs much faster than them. It also provides features that are useful for toolchain developers. The linker supports ELF (Unix), PE/COFF (Windows), Mach-O (macOS) and WebAssembly in descending order.

TinyGo is a Go compiler(based on LLVM) intended for use in small places such as microcontrollers, WebAssembly (Wasm), and command-line tools.

FileCheck is a flexible pattern matching file verifier.

tblgen is a description to C++ Code.

clang-tblgen is a description to C++ Code for Clang.

lldb-tblgen is a description to C++ Code for LLDB.

llvm-tblgen is a target description to C++ Code for LLVM.

mlir-tblgen is a description to C++ Code for MLIR.

lit is a LLVM Integrated Tester.

llvm-exegesis is a LLVM Machine Instruction Benchmark.

llvm-locstats is a calculate statistics on DWARF debug location.

llvm-pdbutil is a PDB File forensics and diagnostics.

llvm-profgen is a LLVM SPGO profile generation tool

bugpoint is a automatic test case reduction tool.

llvm-extract is a extract a function from an LLVM module.

llvm-bcanalyzer is a LLVM bitcode analyzer.

llvm-addr2line is a drop-in replacement for addr2line.

llvm-ar is a LLVM archiver.

llvm-cxxfilt is a LLVM symbol name demangler.

llvm-install-name-tool is a LLVM tool for manipulating install-names and rpaths.

llvm-nm is a list LLVM bitcode and object file’s symbol table.

llvm-objcopy is a object copying and editing tool.

llvm-objdump is a LLVM’s object file dumper.

llvm-ranlib is a generates an archive index.

llvm-readelf is a GNU-style LLVM Object Reader.

llvm-size is a print size information.

llvm-strings is a print strings.

llvm-strip is a object stripping tool.

GO Development

Back to the Top


Go Learning Resources

Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.

Golang Contribution Guide

Google Developers Training

Google Developers Certification

Uber's Go Style Guide

GitLab's Go standards and style guidelines

Effective Go

Go: The Complete Developer's Guide (Golang) on Udemy

Getting Started with Go on Coursera

Programming with Google Go on Coursera

Learning Go Fundamentals on Pluralsight

Learning Go on Codecademy

Go Tools and Frameworks

golang tools holds the source for various packages and tools that support the Go programming language.

Go in Visual Studio Code is an extension that gives you language features like IntelliSense, code navigation, symbol search, bracket matching, snippets, and many more that will help you in Golang development.

Traefik is a modern HTTP reverse proxy and load balancer that makes deploying microservices easy. Traefik integrates with your existing infrastructure components (Docker, Swarm mode, Kubernetes, Marathon, Consul, Etcd, Rancher, Amazon ECS, etc.) and configures itself automatically and dynamically. Pointing Traefik at your orchestrator should be the only configuration step you need.

Gitea is Git with a cup of tea, painless self-hosted git service. Using Go, this can be done with an independent binary distribution across all platforms which Go supports, including Linux, macOS, and Windows on x86, amd64, ARM and PowerPC architectures.

OpenFaaS is Serverless Functions Made Simple. It makes it easy for developers to deploy event-driven functions and microservices to Kubernetes without repetitive, boiler-plate coding. Package your code or an existing binary in a Docker image to get a highly scalable endpoint with auto-scaling and metrics.

micro is a terminal-based text editor that aims to be easy to use and intuitive, while also taking advantage of the capabilities of modern terminals. As its name indicates, micro aims to be somewhat of a successor to the nano editor by being easy to install and use. It strives to be enjoyable as a full-time editor for people who prefer to work in a terminal, or those who regularly edit files over SSH.

Gravitational Teleport is a modern security gateway for remotely accessing into Clusters of Linux servers via SSH or SSH-over-HTTPS in a browser or Kubernetes clusters.

NATS is a simple, secure and performant communications system for digital systems, services and devices. NATS is part of the Cloud Native Computing Foundation (CNCF). NATS has over 30 client language implementations, and its server can run on-premise, in the cloud, at the edge, and even on a Raspberry Pi. NATS can secure and simplify design and operation of modern distributed systems.

Act is a GO program that allows you to run our GitHub Actions locally.

Fiber is an Express inspired web framework built on top of Fasthttp, the fastest HTTP engine for Go. Designed to ease things up for fast development with zero memory allocation and performance in mind.

Glide is a vendor Package Management for Golang.

BadgerDB is an embeddable, persistent and fast key-value (KV) database written in pure Go. It is the underlying database for Dgraph, a fast, distributed graph database. It's meant to be a performant alternative to non-Go-based key-value stores like RocksDB.

Go kit is a programming toolkit for building microservices (or elegant monoliths) in Go. We solve common problems in distributed systems and application architecture so you can focus on delivering business value.

Codis is a proxy based high performance Redis cluster solution written in Go.

zap is a blazing fast, structured, leveled logging in Go.

HttpRouter is a lightweight high performance HTTP request router (also called multiplexer or just mux for short) for Go.

Gorilla WebSocket is a Go implementation of the WebSocket protocol.

Delve is a debugger for the Go programming language.

GORM is a fantastic ORM library for Golang, aims to be developer friendly.

Go Patterns is a curated collection of idiomatic design & application patterns for Go language.

C/C++ Development

Back to the Top


C/C++ Learning Resources

C++ is a cross-platform language that can be used to build high-performance applications developed by Bjarne Stroustrup, as an extension to the C language.

C is a general-purpose, high-level language that was originally developed by Dennis M. Ritchie to develop the UNIX operating system at Bell Labs. It supports structured programming, lexical variable scope, and recursion, with a static type system. C also provides constructs that map efficiently to typical machine instructions, which makes it one was of the most widely used programming languages today.

Embedded C is a set of language extensions for the C programming language by the C Standards Committee to address issues that exist between C extensions for different embedded systems. The extensions hep enhance microprocessor features such as fixed-point arithmetic, multiple distinct memory banks, and basic I/O operations. This makes Embedded C the most popular embedded software language in the world.

C & C++ Developer Tools from JetBrains

Open source C++ libraries on cppreference.com

C++ Graphics libraries

C++ Libraries in MATLAB

C++ Tools and Libraries Articles

Google C++ Style Guide

Introduction C++ Education course on Google Developers

C++ style guide for Fuchsia

C and C++ Coding Style Guide by OpenTitan

Chromium C++ Style Guide

C++ Core Guidelines

C++ Style Guide for ROS

Learn C++

Learn C : An Interactive C Tutorial

C++ Institute

C++ Online Training Courses on LinkedIn Learning

C++ Tutorials on W3Schools

Learn C Programming Online Courses on edX

Learn C++ with Online Courses on edX

Learn C++ on Codecademy

Coding for Everyone: C and C++ course on Coursera

C++ For C Programmers on Coursera

Top C Courses on Coursera

C++ Online Courses on Udemy

Top C Courses on Udemy

Basics of Embedded C Programming for Beginners on Udemy

C++ For Programmers Course on Udacity

C++ Fundamentals Course on Pluralsight

Introduction to C++ on MIT Free Online Course Materials

Introduction to C++ for Programmers | Harvard

Online C Courses | Harvard University

C/C++ Tools and Frameworks

AWS SDK for C++

Azure SDK for C++

Azure SDK for C

C++ Client Libraries for Google Cloud Services

Visual Studio is an integrated development environment (IDE) from Microsoft; which is a feature-rich application that can be used for many aspects of software development. Visual Studio makes it easy to edit, debug, build, and publish your app. By using Microsoft software development platforms such as Windows API, Windows Forms, Windows Presentation Foundation, and Windows Store.

Visual Studio Code is a code editor redefined and optimized for building and debugging modern web and cloud applications.

Vcpkg is a C++ Library Manager for Windows, Linux, and MacOS.

ReSharper C++ is a Visual Studio Extension for C++ developers developed by JetBrains.

AppCode is constantly monitoring the quality of your code. It warns you of errors and smells and suggests quick-fixes to resolve them automatically. AppCode provides lots of code inspections for Objective-C, Swift, C/C++, and a number of code inspections for other supported languages. All code inspections are run on the fly.

CLion is a cross-platform IDE for C and C++ developers developed by JetBrains.

Code::Blocks is a free C/C++ and Fortran IDE built to meet the most demanding needs of its users. It is designed to be very extensible and fully configurable. Built around a plugin framework, Code::Blocks can be extended with plugins.

CppSharp is a tool and set of libraries which facilitates the usage of native C/C++ code with the .NET ecosystem. It consumes C/C++ header and library files and generates the necessary glue code to surface the native API as a managed API. Such an API can be used to consume an existing native library in your managed code or add managed scripting support to a native codebase.

Conan is an Open Source Package Manager for C++ development and dependency management into the 21st century and on par with the other development ecosystems.

High Performance Computing (HPC) SDK is a comprehensive toolbox for GPU accelerating HPC modeling and simulation applications. It includes the C, C++, and Fortran compilers, libraries, and analysis tools necessary for developing HPC applications on the NVIDIA platform.

Thrust is a C++ parallel programming library which resembles the C++ Standard Library. Thrust's high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs. Interoperability with established technologies such as CUDA, TBB, and OpenMP integrates with existing software.

Boost is an educational opportunity focused on cutting-edge C++. Boost has been a participant in the annual Google Summer of Code since 2007, in which students develop their skills by working on Boost Library development.

Automake is a tool for automatically generating Makefile.in files compliant with the GNU Coding Standards. Automake requires the use of GNU Autoconf.

Cmake is an open-source, cross-platform family of tools designed to build, test and package software. CMake is used to control the software compilation process using simple platform and compiler independent configuration files, and generate native makefiles and workspaces that can be used in the compiler environment of your choice.

GDB is a debugger, that allows you to see what is going on `inside' another program while it executes or what another program was doing at the moment it crashed.

GCC is a compiler Collection that includes front ends for C, C++, Objective-C, Fortran, Ada, Go, and D, as well as libraries for these languages.

GSL is a numerical library for C and C++ programmers. It is free software under the GNU General Public License. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.

OpenGL Extension Wrangler Library (GLEW) is a cross-platform open-source C/C++ extension loading library. GLEW provides efficient run-time mechanisms for determining which OpenGL extensions are supported on the target platform.

Libtool is a generic library support script that hides the complexity of using shared libraries behind a consistent, portable interface. To use Libtool, add the new generic library building commands to your Makefile, Makefile.in, or Makefile.am.

Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting and documentation from a central piece of information.

TAU (Tuning And Analysis Utilities) is capable of gathering performance information through instrumentation of functions, methods, basic blocks, and statements as well as event-based sampling. All C++ language features are supported including templates and namespaces.

Clang is a production quality C, Objective-C, C++ and Objective-C++ compiler when targeting X86-32, X86-64, and ARM (other targets may have caveats, but are usually easy to fix). Clang is used in production to build performance-critical software like Google Chrome or Firefox.

OpenCV is a highly optimized library with focus on real-time applications. Cross-Platform C++, Python and Java interfaces support Linux, MacOS, Windows, iOS, and Android.

Libcu++ is the NVIDIA C++ Standard Library for your entire system. It provides a heterogeneous implementation of the C++ Standard Library that can be used in and between CPU and GPU code.

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It's widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build parse trees and also generates a listener interface that makes it easy to respond to the recognition of phrases of interest.

Oat++ is a light and powerful C++ web framework for highly scalable and resource-efficient web application. It's zero-dependency and easy-portable.

JavaCPP is a program that provides efficient access to native C++ inside Java, not unlike the way some C/C++ compilers interact with assembly language.

Cython is a language that makes writing C extensions for Python as easy as Python itself. Cython is based on Pyrex, but supports more cutting edge functionality and optimizations such as calling C functions and declaring C types on variables and class attributes.

Spdlog is a very fast, header-only/compiled, C++ logging library.

Infer is a static analysis tool for Java, C++, Objective-C, and C. Infer is written in OCaml.

Rust Development

Back to the Top


Rust Learning Resources

Rust is a multi-paradigm programming language focused on performance and safety. Rust has a comparable amount of runtime to C and C++, and has set up its standard library to be amenable towards OS development. Specifically, the standard library is split into two parts: core and std. Core is the lowest-level aspects only, and doesn't include things like allocation, threading, and other higher-level features.

The Rust Language Reference

The Rust Programming Language Book

Learning Rust

Why AWS loves Rust

Rust Programming courses on Udemy

Safety in Systems Programming with Rust at Standford by Ryan Eberhardt

WebAssembly meets Kubernetes with Krustlet using Rust

Microsoft's Project Verona

Rust Tools and Frameworks

Cargo is a package manager that downloads your Rust project’s dependencies and compiles your project.

Crater is a tool to run experiments across parts of the Rust ecosystem. Its primary purpose is to detect regressions in the Rust compiler, and it does this by building a large number of crates, running their test suites and comparing the results between two versions of the Rust compiler. It can operate locally (with Docker as the only dependency) or distributed on the cloud. It can operate locally (with Docker as the only dependency) or distributed on the cloud.

VSCode-Rust is plugin that adds language support for Rust to Visual Studio Code. Rust support is powered by a separate language server - either by the official Rust Language Server (RLS) or rust-analyzer, depending on the user's preference. If you don't have it installed, the extension will install it for you (with permission). This extension is built and maintained by the Rust IDEs and editors team with the focus on providing a stable, high quality extension that makes the best use of the respective language server.

Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enable big data systems to process and move data fast. Arrow's libraries are available for C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust.

Wasmer enables super lightweight containers based on WebAssembly that can run anywhere such as the Desktop to the Cloud and IoT devices, and also embedded in any programming language.

Firecracker is an open source virtualization technology that is purpose-built for creating and managing secure, multi-tenant container and function-based services that provide serverless operational models. Firecracker runs workloads in lightweight virtual machines, called microVMs, which combine the security and isolation properties provided by hardware virtualization technology with the speed and flexibility of containers. Firecracker has also been integrated in container runtimes, for example Kata Containers and Weaveworks Ignite.

Tokio is an event-driven, non-blocking I/O platform for writing asynchronous applications with the Rust programming language.

TiKV is an open-source distributed transactional key-value database that also provides classical key-vlue APIs, but also transactional APIs with ACID compliance.

Sonic is a fast, lightweight and schema-less search backend similar to Elasticsearch in some use-cases.

Hyper is a fast and correct HTTP library for Rust.

Rocket is an async web framework for Rust with a focus on usability, security, extensibility, and speed.

Clippy is a collection of lints to catch common mistakes and improve your Rust code.

Servo is a prototype web browser engine written in the Rust language.

Vector is a high-performance, end-to-end (agent & aggregator) observability data platform that puts the user in control of their observability data.

RustPython is a Python Interpreter written in Rust.

Miri is an interpreter for Rust's mid-level intermediate representation. It can run binaries and test suites of cargo projects and detect certain classes of undefined behavior. Miri will alsowill also tell you about memory leaks: when there is memory still allocated at the end of the execution, and that memory is not reachable from a global static, Miri will raise an error.

Chalk is an implementation and definition of the Rust trait system using a PROLOG-like logic solver.

stdarch is Rust's standard library vendor-specific APIs and run-time feature detection.

Simpleinfra is rep that contains the tools and automation written by the Rust infrastructure team to manage our services. Using some of the tools in this repo require privileges only infra team members have.

Rustlings is a small set of exercises to get you used to reading and writing Rust code.

Krustlet acts as a Kubernetes Kubelet(written in Rust) by listening on the event stream for new pods that the scheduler assigns to it based on specific Kubernetes tolerations. The project is currently experimental.

Rust-based Operating Systems

Redox is a Unix-like Operating System written in Rust, aiming to bring the innovations of Rust to a modern microkernel and full set of applications. Acitvely being developed by Jeremy Soeller.

Bottlerocket OS is an open-source Linux-based operating system meant for hosting containers. Bottlerocket focuses on security and maintainability, providing a reliable, consistent, and safe platform for container-based workloads.

Tock is an embedded operating system designed for running multiple concurrent, mutually distrustful applications on Cortex-M and RISC-V based embedded platforms. Tock's design centers around protection, both from potentially malicious applications and from device drivers. Tock uses two mechanisms to protect different components of the operating system. First, the kernel and device drivers are written in Rust, a systems programming language that provides compile-time memory safety, type safety and strict aliasing. Tock uses Rust to protect the kernel (the scheduler and hardware abstraction layer) from platform specific device drivers as well as isolate device drivers from each other. Second, Tock uses memory protection units to isolate applications from each other and the kernel.

Rust on Chrome OS is a document that provides information on creating Rust projects for installation within Chrome OS and Chrome OS SDK.

Writing an OS in Rust is a blog series creates a small operating system in the Rust programming language by Philipp Oppermann.

Networking

Back to the Top


Network Learning Resources

AWS Certified Security - Specialty Certification

Microsoft Certified: Azure Security Engineer Associate

Google Cloud Certified Professional Cloud Security Engineer

Cisco Security Certifications

The Red Hat Certified Specialist in Security: Linux

Linux Professional Institute LPIC-3 Enterprise Security Certification

Cybersecurity Training and Courses from IBM Skills

Cybersecurity Courses and Certifications by Offensive Security

Citrix Certified Associate – Networking(CCA-N)

Citrix Certified Professional – Virtualization(CCP-V)

CCNP Routing and Switching

Certified Information Security Manager(CISM)

Wireshark Certified Network Analyst (WCNA)

Juniper Networks Certification Program Enterprise (JNCP)

Networking courses and specializations from Coursera

Network & Security Courses from Udemy

Network & Security Courses from edX

Networking Tools & Concepts

Qt Network Authorization is a tool that provides a set of APIs that enable Qt applications to obtain limited access to online accounts and HTTP services without exposing users' passwords.

cURL is a computer software project providing a library and command-line tool for transferring data using various network protocols(HTTP, HTTPS, FTP, FTPS, SCP, SFTP, TFTP, DICT, TELNET, LDAP LDAPS, MQTT, POP3, POP3S, RTMP, RTMPS, RTSP, SCP, SFTP, SMB, SMBS, SMTP or SMTPS). cURL is also used in cars, television sets, routers, printers, audio equipment, mobile phones, tablets, settop boxes, media players and is the Internet transfer engine for thousands of software applications in over ten billion installations.

cURL Fuzzer is a quality assurance testing for the curl project.

DoH is a stand-alone application for DoH (DNS-over-HTTPS) name resolves and lookups.

Authelia is an open-source highly-available authentication server providing single sign-on capability and two-factor authentication to applications running behind NGINX.

nginx(engine x) is an HTTP and reverse proxy server, a mail proxy server, and a generic TCP/UDP proxy server, originally written by Igor Sysoev.

Proxmox Virtual Environment(VE) is a complete open-source platform for enterprise virtualization. It inlcudes a built-in web interface that you can easily manage VMs and containers, software-defined storage and networking, high-availability clustering, and multiple out-of-the-box tools on a single solution.

Wireshark is a very popular network protocol analyzer that is commonly used for network troubleshooting, analysis, and communications protocol development. Learn more about the other useful Wireshark Tools available.

HTTPie is a command-line HTTP client. Its goal is to make CLI interaction with web services as human-friendly as possible. HTTPie is designed for testing, debugging, and generally interacting with APIs & HTTP servers.

HTTPStat is a tool that visualizes curl statistics in a simple layout.

Wuzz is an interactive cli tool for HTTP inspection. It can be used to inspect/modify requests copied from the browser's network inspector with the "copy as cURL" feature.

Websocat is a ommand-line client for WebSockets, like netcat (or curl) for ws:// with advanced socat-like functions.

  • Connection: In networking, a connection refers to pieces of related information that are transferred through a network. This generally infers that a connection is built before the data transfer (by following the procedures laid out in a protocol) and then is deconstructed at the at the end of the data transfer.

  • Packet: A packet is, generally speaking, the most basic unit that is transferred over a network. When communicating over a network, packets are the envelopes that carry your data (in pieces) from one end point to the other.

Packets have a header portion that contains information about the packet including the source and destination, timestamps, network hops. The main portion of a packet contains the actual data being transferred. It is sometimes called the body or the payload.

  • Network Interface: A network interface can refer to any kind of software interface to networking hardware. For instance, if you have two network cards in your computer, you can control and configure each network interface associated with them individually.

A network interface may be associated with a physical device, or it may be a representation of a virtual interface. The "loop-back" device, which is a virtual interface to the local machine, is an example of this.

  • LAN: LAN stands for "local area network". It refers to a network or a portion of a network that is not publicly accessible to the greater internet. A home or office network is an example of a LAN.

  • WAN: WAN stands for "wide area network". It means a network that is much more extensive than a LAN. While WAN is the relevant term to use to describe large, dispersed networks in general, it is usually meant to mean the internet, as a whole. If an interface is connected to the WAN, it is generally assumed that it is reachable through the internet.

  • Protocol: A protocol is a set of rules and standards that basically define a language that devices can use to communicate. There are a great number of protocols in use extensively in networking, and they are often implemented in different layers.

Some low level protocols are TCP, UDP, IP, and ICMP. Some familiar examples of application layer protocols, built on these lower protocols, are HTTP (for accessing web content), SSH, TLS/SSL, and FTP.

  • Port: A port is an address on a single machine that can be tied to a specific piece of software. It is not a physical interface or location, but it allows your server to be able to communicate using more than one application.

  • Firewall: A firewall is a program that decides whether traffic coming into a server or going out should be allowed. A firewall usually works by creating rules for which type of traffic is acceptable on which ports. Generally, firewalls block ports that are not used by a specific application on a server.

  • NAT: Network address translation is a way to translate requests that are incoming into a routing server to the relevant devices or servers that it knows about in the LAN. This is usually implemented in physical LANs as a way to route requests through one IP address to the necessary backend servers.

  • VPN: Virtual private network is a means of connecting separate LANs through the internet, while maintaining privacy. This is used as a means of connecting remote systems as if they were on a local network, often for security reasons.

Network Layers

While networking is often discussed in terms of topology in a horizontal way, between hosts, its implementation is layered in a vertical fashion throughout a computer or network. This means is that there are multiple technologies and protocols that are built on top of each other in order for communication to function more easily. Each successive, higher layer abstracts the raw data a little bit more, and makes it simpler to use for applications and users. It also allows you to leverage lower layers in new ways without having to invest the time and energy to develop the protocols and applications that handle those types of traffic.

As data is sent out of one machine, it begins at the top of the stack and filters downwards. At the lowest level, actual transmission to another machine takes place. At this point, the data travels back up through the layers of the other computer. Each layer has the ability to add its own "wrapper" around the data that it receives from the adjacent layer, which will help the layers that come after decide what to do with the data when it is passed off.

One method of talking about the different layers of network communication is the OSI model. OSI stands for Open Systems Interconnect.This model defines seven separate layers. The layers in this model are:

  • Application: The application layer is the layer that the users and user-applications most often interact with. Network communication is discussed in terms of availability of resources, partners to communicate with, and data synchronization.

  • Presentation: The presentation layer is responsible for mapping resources and creating context. It is used to translate lower level networking data into data that applications expect to see.

  • Session: The session layer is a connection handler. It creates, maintains, and destroys connections between nodes in a persistent way.

  • Transport: The transport layer is responsible for handing the layers above it a reliable connection. In this context, reliable refers to the ability to verify that a piece of data was received intact at the other end of the connection. This layer can resend information that has been dropped or corrupted and can acknowledge the receipt of data to remote computers.

  • Network: The network layer is used to route data between different nodes on the network. It uses addresses to be able to tell which computer to send information to. This layer can also break apart larger messages into smaller chunks to be reassembled on the opposite end.

  • Data Link: This layer is implemented as a method of establishing and maintaining reliable links between different nodes or devices on a network using existing physical connections.

  • Physical: The physical layer is responsible for handling the actual physical devices that are used to make a connection. This layer involves the bare software that manages physical connections as well as the hardware itself (like Ethernet).

The TCP/IP model, more commonly known as the Internet protocol suite, is another layering model that is simpler and has been widely adopted.It defines the four separate layers, some of which overlap with the OSI model:

  • Application: In this model, the application layer is responsible for creating and transmitting user data between applications. The applications can be on remote systems, and should appear to operate as if locally to the end user. The communication takes place between peers network.

  • Transport: The transport layer is responsible for communication between processes. This level of networking utilizes ports to address different services. It can build up unreliable or reliable connections depending on the type of protocol used.

  • Internet: The internet layer is used to transport data from node to node in a network. This layer is aware of the endpoints of the connections, but does not worry about the actual connection needed to get from one place to another. IP addresses are defined in this layer as a way of reaching remote systems in an addressable manner.

  • Link: The link layer implements the actual topology of the local network that allows the internet layer to present an addressable interface. It establishes connections between neighboring nodes to send data.

Interfaces

Interfaces are networking communication points for your computer. Each interface is associated with a physical or virtual networking device. Typically, your server will have one configurable network interface for each Ethernet or wireless internet card you have. In addition, it will define a virtual network interface called the "loopback" or localhost interface. This is used as an interface to connect applications and processes on a single computer to other applications and processes. You can see this referenced as the "lo" interface in many tools.

Network Protocols

Networking works by piggybacks on a number of different protocols on top of each other. In this way, one piece of data can be transmitted using multiple protocols encapsulated within one another.

Media Access Control(MAC) is a communications protocol that is used to distinguish specific devices. Each device is supposed to get a unique MAC address during the manufacturing process that differentiates it from every other device on the internet. Addressing hardware by the MAC address allows you to reference a device by a unique value even when the software on top may change the name for that specific device during operation. Media access control is one of the only protocols from the link layer that you are likely to interact with on a regular basis.

The IP protocol is one of the fundamental protocols that allow the internet to work. IP addresses are unique on each network and they allow machines to address each other across a network. It is implemented on the internet layer in the IP/TCP model. Networks can be linked together, but traffic must be routed when crossing network boundaries. This protocol assumes an unreliable network and multiple paths to the same destination that it can dynamically change between. There are a number of different implementations of the protocol. The most common implementation today is IPv4, although IPv6 is growing in popularity as an alternative due to the scarcity of IPv4 addresses available and improvements in the protocols capabilities.

ICMP: internet control message protocol is used to send messages between devices to indicate the availability or error conditions. These packets are used in a variety of network diagnostic tools, such as ping and traceroute. Usually ICMP packets are transmitted when a packet of a different kind meets some kind of a problem. Basically, they are used as a feedback mechanism for network communications.

TCP: Transmission control protocol is implemented in the transport layer of the IP/TCP model and is used to establish reliable connections. TCP is one of the protocols that encapsulates data into packets. It then transfers these to the remote end of the connection using the methods available on the lower layers. On the other end, it can check for errors, request certain pieces to be resent, and reassemble the information into one logical piece to send to the application layer. The protocol builds up a connection prior to data transfer using a system called a three-way handshake. This is a way for the two ends of the communication to acknowledge the request and agree upon a method of ensuring data reliability. After the data has been sent, the connection is torn down using a similar four-way handshake. TCP is the protocol of choice for many of the most popular uses for the internet, including WWW, FTP, SSH, and email. It is safe to say that the internet we know today would not be here without TCP.

UDP: User datagram protocol is a popular companion protocol to TCP and is also implemented in the transport layer. The fundamental difference between UDP and TCP is that UDP offers unreliable data transfer. It does not verify that data has been received on the other end of the connection. This might sound like a bad thing, and for many purposes, it is. However, it is also extremely important for some functions. It’s not required to wait for confirmation that the data was received and forced to resend data, UDP is much faster than TCP. It does not establish a connection with the remote host, it simply fires off the data to that host and doesn't care if it is accepted or not. Since UDP is a simple transaction, it is useful for simple communications like querying for network resources. It also doesn't maintain a state, which makes it great for transmitting data from one machine to many real-time clients. This makes it ideal for VOIP, games, and other applications that cannot afford delays.

HTTP: Hypertext transfer protocol is a protocol defined in the application layer that forms the basis for communication on the web. HTTP defines a number of functions that tell the remote system what you are requesting. For instance, GET, POST, and DELETE all interact with the requested data in a different way.

FTP: File transfer protocol is in the application layer and provides a way of transferring complete files from one host to another. It is inherently insecure, so it is not recommended for any externally facing network unless it is implemented as a public, download-only resource.

DNS: Domain name system is an application layer protocol used to provide a human-friendly naming mechanism for internet resources. It is what ties a domain name to an IP address and allows you to access sites by name in your browser.

SSH: Secure shell is an encrypted protocol implemented in the application layer that can be used to communicate with a remote server in a secure way. Many additional technologies are built around this protocol because of its end-to-end encryption and ubiquity. There are many other protocols that we haven't covered that are equally important. However, this should give you a good overview of some of the fundamental technologies that make the internet and networking possible.

REST(REpresentational State Transfer) is an architectural style for providing standards between computer systems on the web, making it easier for systems to communicate with each other.

JSON Web Token (JWT) is a compact URL-safe means of representing claims to be transferred between two parties. The claims in a JWT are encoded as a JSON object that is digitally signed using JSON Web Signature (JWS).

OAuth 2.0 is an open source authorization framework that enables applications to obtain limited access to user accounts on an HTTP service, such as Amazon, Google, Facebook, Microsoft, Twitter GitHub, and DigitalOcean. It works by delegating user authentication to the service that hosts the user account, and authorizing third-party applications to access the user account.

Kubernetes

Back to the Top


Kubernetes Learning Resources

Kubernetes (K8s) is an open-source system for automating deployment, scaling, and management of containerized applications.

Getting Kubernetes Certifications

Getting started with Kubernetes on AWS

Kubernetes on Microsoft Azure

Intro to Azure Kubernetes Service

Azure Red Hat OpenShift

Getting started with Google Cloud

Getting started with Kubernetes on Red Hat

Getting started with Kubernetes on IBM

Red Hat OpenShift on IBM Cloud

Enable OpenShift Virtualization on Red Hat OpenShift

YAML basics in Kubernetes

Elastic Cloud on Kubernetes

Docker and Kubernetes

Running Apache Spark on Kubernetes

Kubernetes Across VMware vRealize Automation

VMware Tanzu Kubernetes Grid

All the Ways VMware Tanzu Works with AWS

VMware Tanzu Education

Using Ansible in a Cloud-Native Kubernetes Environment

Managing Kubernetes (K8s) objects with Ansible

Setting up a Kubernetes cluster using Vagrant and Ansible

Running MongoDB with Kubernetes

Kubernetes Fluentd

Understanding the new GitLab Kubernetes Agent

Intro Local Process with Kubernetes for Visual Studio 2019

Kubernetes Contributors

KubeAcademy from VMware

Kubernetes Tutorials from Pulumi

Kubernetes Playground by Katacoda

Scalable Microservices with Kubernetes course from Udacity

Kubernetes Tools, Frameworks, and Projects

Open Container Initiative is an open governance structure for the express purpose of creating open industry standards around container formats and runtimes.

Buildah is a command line tool to build Open Container Initiative (OCI) images. It can be used with Docker, Podman, Kubernetes.

Podman is a daemonless, open source, Linux native tool designed to make it easy to find, run, build, share and deploy applications using Open Containers Initiative (OCI) Containers and Container Images. Podman provides a command line interface (CLI) familiar to anyone who has used the Docker Container Engine.

Containerd is a daemon that manages the complete container lifecycle of its host system, from image transfer and storage to container execution and supervision to low-level storage to network attachments and beyond. It is available for Linux and Windows.

Google Kubernetes Engine (GKE) is a managed, production-ready environment for running containerized applications.

Azure Kubernetes Service (AKS) is serverless Kubernetes, with a integrated continuous integration and continuous delivery (CI/CD) experience, and enterprise-grade security and governance. Unite your development and operations teams on a single platform to rapidly build, deliver, and scale applications with confidence.

Amazon EKS is a tool that runs Kubernetes control plane instances across multiple Availability Zones to ensure high availability.

AWS Controllers for Kubernetes (ACK) is a new tool that lets you directly manage AWS services from Kubernetes. ACK makes it simple to build scalable and highly-available Kubernetes applications that utilize AWS services.

Container Engine for Kubernetes (OKE) is an Oracle-managed container orchestration service that can reduce the time and cost to build modern cloud native applications. Unlike most other vendors, Oracle Cloud Infrastructure provides Container Engine for Kubernetes as a free service that runs on higher-performance, lower-cost compute.

Anthos is a modern application management platform that provides a consistent development and operations experience for cloud and on-premises environments.

Red Hat Openshift is a fully managed Kubernetes platform that provides a foundation for on-premises, hybrid, and multicloud deployments.

OKD is a community distribution of Kubernetes optimized for continuous application development and multi-tenant deployment. OKD adds developer and operations-centric tools on top of Kubernetes to enable rapid application development, easy deployment and scaling, and long-term lifecycle maintenance for small and large teams.

Odo is a fast, iterative, and straightforward CLI tool for developers who write, build, and deploy applications on Kubernetes and OpenShift.

Kata Operator is an operator to perform lifecycle management (install/upgrade/uninstall) of Kata Runtime on Openshift as well as Kubernetes cluster.

Thanos is a set of components that can be composed into a highly available metric system with unlimited storage capacity, which can be added seamlessly on top of existing Prometheus deployments.

OpenShift Hive is an operator which runs as a service on top of Kubernetes/OpenShift. The Hive service can be used to provision and perform initial configuration of OpenShift 4 clusters.

Rook is a tool that turns distributed storage systems into self-managing, self-scaling, self-healing storage services. It automates the tasks of a storage administrator: deployment, bootstrapping, configuration, provisioning, scaling, upgrading, migration, disaster recovery, monitoring, and resource management.

VMware Tanzu is a centralized management platform for consistently operating and securing your Kubernetes infrastructure and modern applications across multiple teams and private/public clouds.

Kubespray is a tool that combines Kubernetes and Ansible to easily install Kubernetes clusters that can be deployed on AWS, GCE, Azure, OpenStack, vSphere, Packet (bare metal), Oracle Cloud Infrastructure (Experimental), or Baremetal.

KubeInit provides Ansible playbooks and roles for the deployment and configuration of multiple Kubernetes distributions.

Rancher is a complete software stack for teams adopting containers. It addresses the operational and security challenges of managing multiple Kubernetes clusters, while providing DevOps teams with integrated tools for running containerized workloads.

K3s is a highly available, certified Kubernetes distribution designed for production workloads in unattended, resource-constrained, remote locations or inside IoT appliances.

Helm is a Kubernetes Package Manager tool that makes it easier to install and manage Kubernetes applications.

Knative is a Kubernetes-based platform to build, deploy, and manage modern serverless workloads. Knative takes care of the operational overhead details of networking, autoscaling (even to zero), and revision tracking.

KubeFlow is a tool dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable.

Kubebox is a Terminal and Web console for Kubernetes.

Kubsec is a Security risk analysis for Kubernetes resources.

Replex is a Kubernetes Governance and Cost Management for the Cloud-Native Enterprise.

Virtual Kubelet is an open-source Kubernetes kubelet implementation that masquerades as a kubelet.

Telepresence is a fast, local development for Kubernetes and OpenShift microservices.

Weave Scope is a tool that automatically detects processes, containers, hosts. No kernel modules, no agents, no special libraries, no coding. It seamless integration with Docker, Kubernetes, DCOS and AWS ECS.

Nuclio is a high-performance "serverless" framework focused on data, I/O, and compute intensive workloads. It is well integrated with popular data science tools, such as Jupyter and Kubeflow; supports a variety of data and streaming sources; and supports execution over CPUs and GPUs.

Supergiant Control is a tool that manages the lifecycle of clusters on your infrastructure and allows deployment of applications via HELM. Its deployment and configuration workflows will help you to get up and running with Kubernetes faster.

Supergiant Capacity - Beta is a tool that ensures that the right hardware is available for the required resource load of your Kubernetes cluster at any given time. This helps prevent over-provisioning of your container environment and overspending on your hardware budget.

Test suite for Kubernetes is a test suite consists of two Helm charts for network bandwith testing and load testing a Kuberntes cluster.

Keel is a Kubernetes Operator to automate Helm, DaemonSet, StatefulSet & Deployment updates.

Kube Monkey is an implementation of Netflix's Chaos Monkey for Kubernetes clusters. It randomly deletes Kubernetes (k8s) pods in the cluster encouraging and validating the development of failure-resilient services.

Kube State Metrics (KSM) is a simple service that listens to the Kubernetes API server and generates metrics about the state of the objects. It's not focused on the health of the individual Kubernetes components, but rather on the health of the various objects inside, such as deployments, nodes and pods.

Sonobuoy is a diagnostic tool that makes it easier to understand the state of a Kubernetes cluster by running a choice of configuration tests in an accessible and non-destructive manner.

PowerfulSeal is a powerful testing tool for your Kubernetes clusters, so that you can detect problems as early as possible.

Test Infra is a repository contains tools and configuration files for the testing and automation needs of the Kubernetes project.

cAdvisor (Container Advisor) is a tool that provides container users an understanding of the resource usage and performance characteristics of their running containers. It's a running daemon that collects, aggregates, processes, and exports information about running containers. Specifically, for each container it keeps resource isolation parameters, historical resource usage, histograms of complete historical resource usage and network statistics.

Etcd is a distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines. Etcd is used as the backend for service discovery and stores cluster state and configuration for Kubernetes.

OpenEBS is a Kubernetes-based tool to create stateful applications using Container Attached Storage.

Container Storage Interface (CSI) is an API that lets container orchestration platforms like Kubernetes seamlessly communicate with stored data via a plug-in.

MicroK8s is a tool that delivers the full Kubernetes experience. In a Fully containerized deployment with compressed over-the-air updates for ultra-reliable operations. It is supported on Linux, Windows, and MacOS.

Charmed Kubernetes is a well integrated, turn-key, conformant Kubernetes platform, optimized for your multi-cloud environments developed by Canonical.

Grafana Kubernetes App is a toll that allows you to monitor your Kubernetes cluster's performance. It includes 4 dashboards, Cluster, Node, Pod/Container and Deployment. It allows for the automatic deployment of the required Prometheus exporters and a default scrape config to use with your in cluster Prometheus deployment.

KubeEdge is an open source system for extending native containerized application orchestration capabilities to hosts at Edge.It is built upon kubernetes and provides fundamental infrastructure support for network, app. deployment and metadata synchronization between cloud and edge.

Lens is the most powerful IDE for people who need to deal with Kubernetes clusters on a daily basis. It has support for MacOS, Windows and Linux operating systems.

kind is a tool for running local Kubernetes clusters using Docker container “nodes”. It was primarily designed for testing Kubernetes itself, but may be used for local development or CI.

Flux CD is a tool that automatically ensures that the state of your Kubernetes cluster matches the configuration you've supplied in Git. It uses an operator in the cluster to trigger deployments inside Kubernetes, which means that you don't need a separate continuous delivery tool.

Databases

Back to the Top



SQL/NoSQL Learning Resources

SQL is a standard language for storing, manipulating and retrieving data in relational databases.

NoSQL is a database that is interchangeably referred to as "nonrelational, or "non-SQL" to highlight that the database can handle huge volumes of rapidly changing, unstructured data in different ways than a relational (SQL-based) database with rows and tables.

Transact-SQL(T-SQL) is a Microsoft extension of SQL with all of the tools and applications communicating to a SQL database by sending T-SQL commands.

Introduction to Transact-SQL

SQL Tutorial by W3Schools

Learn SQL Skills Online from Coursera

SQL Courses Online from Udemy

SQL Online Training Courses from LinkedIn Learning

Learn SQL For Free from Codecademy

GitLab's SQL Style Guide

OracleDB SQL Style Guide Basics

Tableau CRM: BI Software and Tools

Databases on AWS

Best Practices and Recommendations for SQL Server Clustering in AWS EC2.

Connecting from Google Kubernetes Engine to a Cloud SQL instance.

Educational Microsoft Azure SQL resources

MySQL Certifications

SQL vs. NoSQL Databases: What's the Difference?

What is NoSQL?

SQL/NoSQL Tools and Databases

Netdata is high-fidelity infrastructure monitoring and troubleshooting, real-time monitoring Agent collects thousands of metrics from systems, hardware, containers, and applications with zero configuration. It runs permanently on all your physical/virtual servers, containers, cloud deployments, and edge/IoT devices, and is perfectly safe to install on your systems mid-incident without any preparation.

Azure Data Studio is an open source data management tool that enables working with SQL Server, Azure SQL DB and SQL DW from Windows, macOS and Linux.

Azure SQL Database is the intelligent, scalable, relational database service built for the cloud. It’s evergreen and always up to date, with AI-powered and automated features that optimize performance and durability for you. Serverless compute and Hyperscale storage options automatically scale resources on demand, so you can focus on building new applications without worrying about storage size or resource management.

Azure SQL Managed Instance is a fully managed SQL Server Database engine instance that's hosted in Azure and placed in your network. This deployment model makes it easy to lift and shift your on-premises applications to the cloud with very few application and database changes. Managed instance has split compute and storage components.

Azure Synapse Analytics is a limitless analytics service that brings together enterprise data warehousing and Big Data analytics. It gives you the freedom to query data on your terms, using either serverless or provisioned resources at scale. It brings together the best of the SQL technologies used in enterprise data warehousing, Spark technologies used in big data analytics, and Pipelines for data integration and ETL/ELT.

MSSQL for Visual Studio Code is an extension for developing Microsoft SQL Server, Azure SQL Database and SQL Data Warehouse everywhere with a rich set of functionalities.

SQL Server Data Tools (SSDT) is a development tool for building SQL Server relational databases, Azure SQL Databases, Analysis Services (AS) data models, Integration Services (IS) packages, and Reporting Services (RS) reports. With SSDT, a developer can design and deploy any SQL Server content type with the same ease as they would develop an application in Visual Studio or Visual Studio Code.

Bulk Copy Program is a command-line tool that comes with Microsoft SQL Server. BCP, allows you to import and export large amounts of data in and out of SQL Server databases quickly snd efficeiently.

SQL Server Migration Assistant is a tool from Microsoft that simplifies database migration process from Oracle to SQL Server, Azure SQL Database, Azure SQL Database Managed Instance and Azure SQL Data Warehouse.

SQL Server Integration Services is a development platform for building enterprise-level data integration and data transformations solutions. Use Integration Services to solve complex business problems by copying or downloading files, loading data warehouses, cleansing and mining data, and managing SQL Server objects and data.

SQL Server Business Intelligence(BI) is a collection of tools in Microsoft's SQL Server for transforming raw data into information businesses can use to make decisions.

Tableau is a Data Visualization software used in relational databases, cloud databases, and spreadsheets. Tableau was acquired by Salesforce in August 2019.

DataGrip is a professional DataBase IDE developed by Jet Brains that provides context-sensitive code completion, helping you to write SQL code faster. Completion is aware of the tables structure, foreign keys, and even database objects created in code you're editing.

RStudio is an integrated development environment for R and Python, with a console, syntax-highlighting editor that supports direct code execution, and tools for plotting, history, debugging and workspace management.

MySQL is a fully managed database service to deploy cloud-native applications using the world's most popular open source database.

PostgreSQL is a powerful, open source object-relational database system with over 30 years of active development that has earned it a strong reputation for reliability, feature robustness, and performance.

Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale. It is a fully managed, multiregion, multimaster, durable database with built-in security, backup and restore, and in-memory caching for internet-scale applications.

Apache Cassandra™ is an open source NoSQL distributed database trusted by thousands of companies for scalability and high availability without compromising performance. Cassandra provides linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data.

Apache HBase™ is an open-source, NoSQL, distributed big data store. It enables random, strictly consistent, real-time access to petabytes of data. HBase is very effective for handling large, sparse datasets. HBase serves as a direct input and output to the Apache MapReduce framework for Hadoop, and works with Apache Phoenix to enable SQL-like queries over HBase tables.

Hadoop Distributed File System (HDFS) is a distributed file system that handles large data sets running on commodity hardware. It is used to scale a single Apache Hadoop cluster to hundreds (and even thousands) of nodes. HDFS is one of the major components of Apache Hadoop, the others being MapReduce and YARN.

Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, Jenkins, Spark, Aurora, and other frameworks on a dynamically shared pool of nodes.

Apache Spark is a unified analytics engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.

ElasticSearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java.

Logstash is a tool for managing events and logs. When used generically, the term encompasses a larger system of log collection, processing, storage and searching activities.

Kibana is an open source data visualization plugin for Elasticsearch. It provides visualization capabilities on top of the content indexed on an Elasticsearch cluster. Users can create bar, line and scatter plots, or pie charts and maps on top of large volumes of data.

Trino is a Distributed SQL query engine for big data. It is able to tremendously speed up ETL processes, allow them all to use standard SQL statement, and work with numerous data sources and targets all in the same system.

Extract, transform, and load (ETL) is a data pipeline used to collect data from various sources, transform the data according to business rules, and load it into a destination data store.

Redis(REmote DIctionary Server) is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. It provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams.

FoundationDB is an open source distributed database designed to handle large volumes of structured data across clusters of commodity servers. It organizes data as an ordered key-value store and employs ACID transactions for all operations. It is especially well-suited for read/write workloads but also has excellent performance for write-intensive workloads. FoundationDB was acquired by Apple in 2015.

IBM DB2 is a collection of hybrid data management products offering a complete suite of AI-empowered capabilities designed to help you manage both structured and unstructured data on premises as well as in private and public cloud environments. Db2 is built on an intelligent common SQL engine designed for scalability and flexibility.

MongoDB is a document database meaning it stores data in JSON-like documents.

OracleDB is a powerful fully managed database helps developers manage business-critical data with the highest availability, reliability, and security.

MariaDB is an enterprise open source database solution for modern, mission-critical applications.

SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine.SQLite is the most used database engine in the world. SQLite is built into all mobile phones and most computers and comes bundled inside countless other applications that people use every day.

SQLite Database Browser is an open source SQL tool that allows users to create, design and edits SQLite database files. It lets users show a log of all the SQL commands that have been issued by them and by the application itself.

InfluxDB is an open source time series platform. This includes APIs for storing and querying data, processing it in the background for ETL or monitoring and alerting purposes, user dashboards, Internet of Things sensor data, and visualizing and exploring the data and more. It also has support for processing data from Graphite.

Atlas is an in-memory dimensional time series database.

CouchbaseDB is an open source distributed multi-model NoSQL document-oriented database. It creates a key-value store with managed cache for sub-millisecond data operations, with purpose-built indexers for efficient queries and a powerful query engine for executing SQL queries.

dbWatch is a complete database monitoring/management solution for SQL Server, Oracle, PostgreSQL, Sybase, MySQL and Azure. Designed for proactive management and automation of routine maintenance in large scale on-premise, hybrid/cloud database environments.

Cosmos DB Profiler is a real-time visual debugger allowing a development team to gain valuable insight and perspective into their usage of Cosmos DB database. It identifies over a dozen suspicious behaviors from your application’s interaction with Cosmos DB.

Adminer is an SQL management client tool for managing databases, tables, relations, indexes, users. Adminer has support for all the popular database management systems such as MySQL, MariaDB, PostgreSQL, SQLite, MS SQL, Oracle, Firebird, SimpleDB, Elasticsearch and MongoDB.

DBeaver is an open source database tool for developers and database administrators. It offers supports for JDBC compliant databases such as MySQL, Oracle, IBM DB2, SQL Server, Firebird, SQLite, Sybase, Teradata, Firebird, Apache Hive, Phoenix, and Presto.

DbVisualizer is a SQL management tool that allows users to manage a wide range of databases such as Oracle, Sybase, SQL Server, MySQL, H3, and SQLite.

AppDynamics Database is a management product for Microsoft SQL Server. With AppDynamics you can monitor and trend key performance metrics such as resource consumption, database objects, schema statistics and more, allowing you to proactively tune and fix issues in a High-Volume Production Environment.

Toad is a SQL Server DBMS toolset developed by Quest. It increases productivity by using extensive automation, intuitive workflows, and built-in expertise. This SQL management tool resolve issues, manage change and promote the highest levels of code quality for both relational and non-relational databases.

Lepide SQL Server is an open source storage manager utility to analyse the performance of SQL Servers. It provides a complete overview of all configuration and permission changes being made to your SQL Server environment through an easy-to-use, graphical user interface.

Sequel Pro is a fast MacOS database management tool for working with MySQL. This SQL management tool helpful for interacting with your database by easily to adding new databases, new tables, and new rows.

Contribute

  • [x] If would you like to contribute to this guide simply make a Pull Request.

License

Back to the Top

Distributed under the Creative Commons Attribution 4.0 International (CC BY 4.0) Public License.