Plugin System Overview
Krustlet partially implements support for CSI and device plugins. For CSI
plugin support, Krustlet partially implements the plugin discovery system used
by the mainline Kubelet. Upon investigation and reverse engineering, we
determined that CSI and device plugins use different APIs: the plugin
registration API and the device plugin API, respectively. CSI plugins use the
automatic plugin discovery method implemented here.
You can see other evidence of this in the csi-common code and the Node Driver
Registrar documentation.
Device plugins, by contrast, are not discovered by watching for sockets as the
CSI pluginwatcher package in the kubelet does. Instead, the kubelet
devicemanager hosts a registration service for device plugins, as described in
the device plugin documentation.
CSI Plugins
Registration: How does it work?
The plugin registration system uses an event-driven loop to discover and register plugins:
- Kubelet uses a file system watcher to watch the given directory.
- Plugins wishing to register themselves with Kubelet must open a Unix domain socket (henceforth referred to as just “socket”) in the watched directory.
- When Kubelet detects a new socket, it connects to the discovered socket and attempts a GetInfo gRPC call.
- Using the info returned from the GetInfo call, Kubelet performs validation to make sure it supports the version of the API requested by the plugin and that the plugin is not already registered. If the plugin is of the CSIPlugin type, the info also contains another path to a socket where the CSI driver is listening.
- If validation succeeds, Kubelet makes a NotifyRegistrationStatus gRPC call on the originally discovered socket to inform the plugin that it has successfully registered.
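The validation step in the loop above can be sketched in Rust as follows. This is a minimal in-memory model, not Krustlet's actual code: the `Registry` type, the `RegistrationError` variants, and the rule that a CSI plugin must supply an endpoint are assumptions made for illustration, and the gRPC calls themselves are left out. The `PluginInfo` fields are a simplified stand-in for what a plugin returns from GetInfo.

```rust
use std::collections::HashSet;

/// Plugin type reported by GetInfo (simplified for illustration).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum PluginType {
    CsiPlugin,
    DevicePlugin,
}

/// Simplified stand-in for the info a plugin returns from GetInfo.
#[derive(Debug, Clone)]
struct PluginInfo {
    plugin_type: PluginType,
    name: String,
    /// For CSI plugins: path to the socket where the CSI driver listens.
    endpoint: Option<String>,
    supported_versions: Vec<String>,
}

/// Hypothetical reasons for rejecting a registration.
#[derive(Debug, PartialEq)]
enum RegistrationError {
    UnsupportedVersion,
    AlreadyRegistered,
    MissingEndpoint,
}

/// Hypothetical in-memory registry of plugin names.
struct Registry {
    kubelet_versions: Vec<String>,
    registered: HashSet<String>,
}

impl Registry {
    fn new(kubelet_versions: &[&str]) -> Self {
        Registry {
            kubelet_versions: kubelet_versions.iter().map(|v| v.to_string()).collect(),
            registered: HashSet::new(),
        }
    }

    /// Validate the info from GetInfo and record the plugin on success.
    /// The result is what would be reported back to the plugin through
    /// NotifyRegistrationStatus.
    fn register(&mut self, info: &PluginInfo) -> Result<(), RegistrationError> {
        // At least one version offered by the plugin must be supported.
        let version_ok = info
            .supported_versions
            .iter()
            .any(|v| self.kubelet_versions.contains(v));
        if !version_ok {
            return Err(RegistrationError::UnsupportedVersion);
        }
        // A plugin may only be registered once.
        if self.registered.contains(&info.name) {
            return Err(RegistrationError::AlreadyRegistered);
        }
        // Assumption: a CSI plugin without a driver socket path is rejected.
        if info.plugin_type == PluginType::CsiPlugin && info.endpoint.is_none() {
            return Err(RegistrationError::MissingEndpoint);
        }
        self.registered.insert(info.name.clone());
        Ok(())
    }
}
```

Registering the same `PluginInfo` twice against one `Registry` returns `Ok(())` the first time and `Err(RegistrationError::AlreadyRegistered)` the second, mirroring the duplicate check described in the list.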
Additional information
In a typical Kubernetes cluster, most CSI plugins register themselves with the kubelet using the Node Driver Registrar sidecar container that runs alongside the actual CSI driver. The sidecar is responsible for creating the socket that the kubelet discovers.
Device Plugins
Krustlet supports Kubernetes device plugins, which enable Kubernetes workloads
to request extended resources, such as hardware, advertised by device plugins.
Krustlet implements the Kubernetes device manager in the kubelet’s resources
module. It implements the device plugin framework’s Registration gRPC service.
The flow from registering a device plugin (DP) to running a Pod that requests a DP extended resource is as follows:
- The kubelet’s DeviceManager hosts the device plugin framework’s Registration gRPC service on the Kubernetes default /var/lib/kubelet/device-plugins/kubelet.sock.
- The DP registers itself with the kubelet through this gRPC service. This allows the DP to advertise a resource, such as system hardware, to the kubelet.
- The kubelet creates a PluginConnection for marshalling requests to the DP. It calls the DP’s ListAndWatch service, creating a streaming connection. The device plugin updates the kubelet about device health across this connection.
- Each time the PluginConnection receives device updates across the ListAndWatch connection, it updates the map of all devices (DeviceMap) shared between the DeviceManager, PluginConnections, and NodePatcher, and notifies the NodePatcher to update the NodeStatus of the node with the appropriate allocatable and capacity entries.
- Once a Pod is applied that requests the resource advertised by the DP (say example.com/mock-plugin), the K8s scheduler can schedule the Pod to this node, since the requested resource is allocatable in the NodeStatus. During the Resources state, if a DP resource is requested, the PluginConnection calls Allocate on the DP, requesting use of the resource.
- To free up DP resources held by terminated Pods, the DeviceManager contains a PodDevices structure that queries the K8s API for currently running Pods before each Allocate call. It then updates its map of allocated devices to remove terminated Pods.
- If the DP dies and the connection is dropped, the devices are removed from the DeviceMap and the NodePatcher zeros capacity and allocatable for the resource in the NodeStatus.
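The bookkeeping around the DeviceMap in the flow above can be sketched as follows. This is a simplified model, not Krustlet's implementation: the method names `apply_update`, `remove_plugin`, and `node_counts` are hypothetical, and the rule that capacity counts all devices while allocatable counts only healthy ones is an assumption modeled on how the upstream kubelet reports device resources.

```rust
use std::collections::HashMap;

/// Device health as reported over the ListAndWatch connection (simplified).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Health {
    Healthy,
    Unhealthy,
}

/// Simplified DeviceMap: resource name -> (device id -> health).
#[derive(Default)]
struct DeviceMap {
    devices: HashMap<String, HashMap<String, Health>>,
}

impl DeviceMap {
    /// Replace the device list of a resource with the latest update.
    fn apply_update(&mut self, resource: &str, update: &[(&str, Health)]) {
        let entry: HashMap<String, Health> = update
            .iter()
            .map(|(id, health)| (id.to_string(), *health))
            .collect();
        self.devices.insert(resource.to_string(), entry);
    }

    /// Drop all devices of a resource, e.g. when the plugin connection is lost.
    fn remove_plugin(&mut self, resource: &str) {
        self.devices.remove(resource);
    }

    /// Return (capacity, allocatable) for a resource.
    /// Assumption: capacity counts every device, allocatable only healthy ones.
    /// An unknown resource yields (0, 0), matching the zeroing behaviour
    /// described for a dropped plugin.
    fn node_counts(&self, resource: &str) -> (usize, usize) {
        match self.devices.get(resource) {
            Some(devs) => {
                let healthy = devs.values().filter(|h| **h == Health::Healthy).count();
                (devs.len(), healthy)
            }
            None => (0, 0),
        }
    }
}
```

In this model, a component playing the NodePatcher role would call `node_counts` after each update and patch the node with the returned pair. After `remove_plugin`, the same call returns `(0, 0)` for that resource.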
What is not supported?
The current implementation does not support the following:
- Calls to a device plugin’s GetPreferredAllocation endpoint in order to make more informed Allocate calls.
- Full handling of ContainerAllocateResponse. Each ContainerAllocateResponse contains environment variables, mounts, device specs, and annotations that should be set in Pods that request the resource. Currently, Krustlet only supports environment variables and a subset of Mounts, namely host_path mounts as volumes.
- Consideration of Device::TopologyInfo, as the Topology Manager has not been implemented in Krustlet.
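To make the second limitation concrete, the sketch below shows one way an allocate response could be reduced to the supported parts. It is an illustration only: the struct definitions are simplified stand-ins for the device plugin API messages, `AppliedAllocation` and `apply_supported` are hypothetical names, and treating every mount with a non-empty host_path as a host-path volume is an assumption about what "a subset of Mounts" means.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a mount in an allocate response.
#[derive(Debug, Clone, PartialEq)]
struct Mount {
    container_path: String,
    host_path: String,
    read_only: bool,
}

/// Simplified stand-in for ContainerAllocateResponse.
#[derive(Debug, Clone, Default)]
struct ContainerAllocateResponse {
    envs: HashMap<String, String>,
    mounts: Vec<Mount>,
    /// Device specs reduced to host paths for illustration.
    devices: Vec<String>,
    annotations: HashMap<String, String>,
}

/// Hypothetical result: what is applied to the Pod and what is dropped.
#[derive(Debug, Default, PartialEq)]
struct AppliedAllocation {
    envs: HashMap<String, String>,
    host_path_volumes: Vec<Mount>,
    ignored_devices: usize,
    ignored_annotations: usize,
}

/// Keep environment variables and host-path mounts; count everything else
/// as ignored so the caller can log it.
fn apply_supported(resp: &ContainerAllocateResponse) -> AppliedAllocation {
    AppliedAllocation {
        envs: resp.envs.clone(),
        // Assumption: a mount qualifies when it names a host path.
        host_path_volumes: resp
            .mounts
            .iter()
            .filter(|m| !m.host_path.is_empty())
            .cloned()
            .collect(),
        ignored_devices: resp.devices.len(),
        ignored_annotations: resp.annotations.len(),
    }
}
```

Counting the ignored device specs and annotations, rather than dropping them silently, is a design choice of this sketch: it gives the caller something to log when a plugin relies on features that are not yet supported.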