The Container.State struct holds the container's state, and most of
its fields are expected to change dynamically. Some o these state-changes
are explicit, for example, setting the container to be "stopped". Other
state changes can be more explicit, for example due to the containers'
process exiting or being "OOM" killed by the kernel.
The distinction between explicit ("desired") state changes and "state"
("actual state") is sometimes vague; for some properties, we clearly
separated them, for example if a user requested the container to be
stopped or restarted, we store state in the Container object itself;
HasBeenManuallyStopped bool // used for unless-stopped restart policy
HasBeenManuallyRestarted bool `json:"-"` // used to distinguish restart caused by restart policy from the manual one
Other properties are more ambiguous. such as "HasBeenStartedBefore" and
"RestartCount", which are stored on the Container (and persisted to
disk), but may be more related to "actual" state, and likely should
not be persisted;
RestartCount int
HasBeenStartedBefore bool
Given that (per the above) concurrency must be taken into account, most
changes to the `container.State` struct should be protected; here's where
things get blurry. While the `State` type provides various accessor methods,
only some of them take concurrency into account; for example, [State.IsRunning]
and [State.GetPID] acquire a lock, whereas [State.ExitCodeValue] does not.
Even the (commonly used) [State.StateString] has no locking at all.
The way to handle this is error-prone; [container.State] contains a mutex,
and it's exported. Given that its embedded in the [container.Container]
struct, it's also exposed as an exported mutex for the container. The
assumption here is that by "merging" the two, the caller to acquire a lock
when either the container _or_ its state must be mutated. However, because
some methods on `container.State` handle their own locking, consumers must
be deeply familiar with the internals; if both changes to the `Container`
AND `Container.State` must be made. This gets amplified more as some
(exported!) methods, such as [container.SetRunning] mutate multiple fields,
but don't acquire a lock (so expect the caller to hold one), but their
(also exported) counterpart (e.g. [State.IsRunning]) do.
It should be clear from the above, that this needs some architectural
changes; a clearer separation between "desired" and "actual" state (opening
the potential to update the container's config without manually touching
its `State`), possibly a method to obtain a read-only copy of the current
state (for those querying state), and reviewing which fields belong where
(and should be persisted to disk, or only remain in memory).
This PR preserves the status quo; it makes no structural changes, other
than exposing where we access the container's state. Where previously the
State fields and methods were referred to as "part of the container"
(e.g. `ctr.IsRunning()` or `ctr.Running`), we now explicitly reference
the embedded `State` (`ctr.State.IsRunning`, `ctr.State.Running`).
The exception (for now) is the mutex, which is still referenced through
the embedded struct (`ctr.Lock()` instead of `ctr.State.Lock()`), as this
is (mostly) by design to protect the container, and what's in it (including
its `State`).
[State.IsRunning]: c4afa77157/daemon/container/state.go (L205-L209)
[State.GetPID]: c4afa77157/daemon/container/state.go (L211-L216)
[State.ExitCodeValue]: c4afa77157/daemon/container/state.go (L218-L228)
[State.StateString]: c4afa77157/daemon/container/state.go (L102-L131)
[container.State]: c4afa77157/daemon/container/state.go (L15-L23)
[container.Container]: c4afa77157/daemon/container/container.go (L67-L75)
[container.SetRunning]: c4afa77157/daemon/container/state.go (L230-L277)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the errdefs utilities to make sure we correctly detect the type
of error if a containerd errdefs type is returned.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "backend" types in API were designed to decouple the API server
implementation from the daemon, or other parts of the code that
back the API server. This would allow the daemon to evolve (e.g.
functionality moved to different subsystems) without that impacting
the API server's implementation.
Now that the API server is no longer part of the API package (module),
there is no benefit to having it in the API module. The API server
may evolve (and require changes in the backend), which has no direct
relation with the API module (types, responses); the backend definition
is, however, coupled to the API server implementation.
It's worth noting that, while "technically" possible to use the API
server package, and implement an alternative backend implementation,
this has never been a prime objective. The backend definition was
never considered "stable", and we don't expect external users to
(attempt) to use it as such.
This patch moves the backend types to the daemon/server package,
so that they can evolve with the daemon and API server implementation
without that impacting the API module (which we intend to be stable,
following SemVer).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These comments were added to enforce using the correct import path for
our packages ("github.com/docker/docker", not "github.com/moby/moby").
However, when working in go module mode (not GOPATH / vendor), they have
no effect, so their impact is limited.
Remove these imports in preparation of migrating our code to become an
actual go module.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This works around issues with the otel http handler wrapper causing
multiple calls to `WriteHeader` when a `Flush` is called before `Write`.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
API v1.23 and older are deprecated, so we can remove the code to adjust
responses for API v1.20 and lower.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.
This patch moves our own uses of the package to use the new module.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit moves one-shot stats processing out of the publishing
channels, i.e. collect stats directly.
Also changes the method of getSystemCPUUsage() on Linux to return
number of online CPUs also.
Signed-off-by: Xinfeng Liu <XinfengLiu@icloud.com>
Metrics collectors generally don't need the daemon to prime the stats
with something to compare since they already have something to compare
with.
Before this change, the API does 2 collection cycles (which takes
roughly 2s) in order to provide comparison for CPU usage over 1s. This
was primarily added so that `docker stats --no-stream` had something to
compare against.
Really the CLI should have just made a 2nd call and done the comparison
itself rather than forcing it on all API consumers.
That ship has long sailed, though.
With this change, clients can set an option to just pull a single stat,
which is *at least* a full second faster:
Old:
```
time curl --unix-socket
/go/src/github.com/docker/docker/bundles/test-integration-shell/docker.sock
http://./containers/test/stats?stream=false\&one-shot=false > /dev/null
2>&1
real0m1.864s
user0m0.005s
sys0m0.007s
time curl --unix-socket
/go/src/github.com/docker/docker/bundles/test-integration-shell/docker.sock
http://./containers/test/stats?stream=false\&one-shot=false > /dev/null
2>&1
real0m1.173s
user0m0.010s
sys0m0.006s
```
New:
```
time curl --unix-socket
/go/src/github.com/docker/docker/bundles/test-integration-shell/docker.sock
http://./containers/test/stats?stream=false\&one-shot=true > /dev/null
2>&1
real0m0.680s
user0m0.008s
sys0m0.004s
time curl --unix-socket
/go/src/github.com/docker/docker/bundles/test-integration-shell/docker.sock
http://./containers/test/stats?stream=false\&one-shot=true > /dev/null
2>&1
real0m0.156s
user0m0.007s
sys0m0.007s
```
This fixes issues with downstreams ability to use the stats API to
collect metrics.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Since Go 1.7, context is a standard package. Since Go 1.9, everything
that is provided by "x/net/context" is a couple of type aliases to
types in "context".
Many vendored packages still use x/net/context, so vendor entry remains
for now.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When `docker stats` stopped containers, client will get empty stats data,
this commit will gurantee client always get "Name" and "ID" field, so
that it can format with `ID` and `Name` fields successfully.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
In case, a container is restarting indefinitely running
"docker stats --no-stream <restarting_container>" is suspended.
To fix this, the daemon makes sure the container is either not
running or restarting if `--no-stream` is set to true and if so
returns an empty stats.
Should fix#27772.
Signed-off-by: Boaz Shuster <ripcurld.github@gmail.com>
This adds support to display names or id of container instead of what
was provided in the request.
This keeps the default behavior (`docker stats byname` will display
`byname` in the `CONTAINER` colmun and `docker stats byid` will display
the id in the `CONTAINER` column) but adds two new format directive.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
This fix tries to address the issue in 25000 where `docker stats`
will not show network stats with `NetworkDisabled=true`.
The `NetworkDisabled=true` could be either invoked through
remote API, or through `docker daemon -b none`.
The issue was that when `NetworkDisabled=true` either by API or
by daemon config, there is no SandboxKey for container so an error
will be returned.
This fix fixes this issue by skipping obtaining SandboxKey if
`NetworkDisabled=true`.
Additional test has bee added to cover the changes.
This fix fixes 25000.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Fixes#22030
Because the publisher uses this same value to all the
stats endpoints we need to make a copy of this as soon as we get it so
that we can make our modifications without it affecting others.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This is done by moving the following types to api/types/config.go:
- ContainersConfig
- ContainerAttachWithLogsConfig
- ContainerWsAttachWithLogsConfig
- ContainerLogsConfig
- ContainerStatsConfig
Remove dependency on "version" package from types.ContainerStatsConfig.
Decouple the "container" router from the "daemon/exec" implementation.
* This is done by making daemon.ContainerExecInspect() return an interface{}
value. The same trick is already used by daemon.ContainerInspect().
Improve documentation for router packages.
Extract localRoute and router into separate files.
Move local.router to image.imageRouter.
Changes:
- Move local/image.go to image/image_routes.go.
- Move local/local.go to image/image.go
- Rename router to imageRouter.
- Simplify imports for image/image.go (remove alias for router package).
Merge router/local package into router package.
Decouple the "image" router from the actual daemon implementation.
Add Daemon.GetNetworkByID and Daemon.GetNetworkByName.
Decouple the "network" router from the actual daemon implementation.
This is done by replacing the daemon.NetworkByName constant with
an explicit GetNetworkByName method.
Remove the unused Daemon.GetNetwork method and the associated constants NetworkByID and NetworkByName.
Signed-off-by: Lukas Waslowski <cr7pt0gr4ph7@gmail.com>
Signed-off-by: David Calavera <david.calavera@gmail.com>
This patch adds the ability to run `docker stats` w/o arguments and get
statistics for all running containers by default. Also add a new
`--all` flag to list statistics for all containers (like `docker ps`).
New running containers are added to the list as they show up also.
Add integration tests for this new behavior.
Docs updated accordingly. Fix missing stuff in man/commandline
reference for `docker stats`.
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Currently, we get the network stats each time per subscriber, causing a
high load of cpu when there are several subscribers per container.
This change makes the daemon to collect once and publish N times, where N is the
number of subscribers per container.
Signed-off-by: David Calavera <david.calavera@gmail.com>