Permissions Users Groups Capabilities and LSMs

Source: Victor Bona's Obsidian Compendium snapshot, Knowledge base/linux-systems-engineering/08 Permissions Users Groups Capabilities and LSMs.md.

Purpose: Build a production-focused model of Linux identity, permissions, privilege boundaries, kernel isolation primitives, LSMs, host secrets, patching, and incident response, with explicit separation between local learning machines and hardened production hosts or clusters.

08 Permissions Users Groups Capabilities and LSMs

Related notes: 06 System Calls ABI libc and User Kernel Boundaries, 07 systemd Boot Init Units Timers Journald and Services, 09 cgroups Namespaces Containers and Runtime Isolation, 17 Production Operations Troubleshooting and Runbooks

Linux security is layered. The basic discretionary access control model uses users, groups, mode bits, ownership, and process credentials. Privilege is then split and constrained through capabilities, setuid and setgid semantics, namespaces, cgroups, seccomp, and Linux Security Modules. Production hardening is not one feature. It is the discipline of making every layer deny something useful to an attacker while preserving operability.

On a local learning machine, you can safely inspect /etc/passwd, create throwaway users, experiment with chmod, use containers to learn namespaces, and test AppArmor in complain mode or SELinux in permissive mode. On production hosts and cluster nodes, identity and policy changes are change-controlled because a single group membership, sudo rule, capability, or LSM label can become a path to root.

Users, groups, UID, and GID

The kernel mostly reasons about numeric IDs. Names are userspace mapping conveniences from sources such as /etc/passwd, /etc/group, LDAP, SSSD, or another NSS provider.

| Concept | Kernel-facing meaning | Operational guidance |
| --- | --- | --- |
| UID | numeric user identity | stable service UIDs matter for file ownership and audit trails |
| GID | primary group identity | do not overload one shared group for unrelated services |
| supplementary groups | additional group credentials attached at login or exec context | review because they silently grant file and device access |
| effective UID | identity used for many permission checks | setuid and sudo change this boundary |
| filesystem UID | identity used for filesystem checks on Linux | usually tracks effective UID but matters in privileged code |
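
The split between kernel-side numbers and userspace names can be seen from inside a process. The following is a minimal sketch using only the Python standard library; the helper names (name_for_uid, name_for_gid, describe_credentials) are illustrative and not part of the original note. Name lookups go through NSS and may fail for unmapped IDs, so the sketch falls back to the number.

```python
import grp
import os
import pwd


def name_for_uid(uid: int) -> str:
    """Resolve a UID through NSS; fall back to the number if it is unmapped."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def name_for_gid(gid: int) -> str:
    """Resolve a GID through NSS; fall back to the number if it is unmapped."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        return str(gid)


def describe_credentials() -> dict:
    """Return the numeric credentials the kernel holds for this process."""
    ruid, euid, suid = os.getresuid()
    rgid, egid, sgid = os.getresgid()
    return {
        "ruid": ruid, "euid": euid, "suid": suid,
        "rgid": rgid, "egid": egid, "sgid": sgid,
        "supplementary": sorted(os.getgroups()),
    }


if __name__ == "__main__":
    creds = describe_credentials()
    print("effective user:", creds["euid"], name_for_uid(creds["euid"]))
    print("effective group:", creds["egid"], name_for_gid(creds["egid"]))
    for gid in creds["supplementary"]:
        print("supplementary:", gid, name_for_gid(gid))
```

The filesystem UID is not exposed by these calls; on Linux it appears as the fourth value on the Uid line of /proc/self/status.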

Commands:

id
id example
getent passwd example
getent group example
groups example
find / -xdev -uid 1001 -ls
find / -xdev -gid 1001 -ls

Production guidance:

  • use dedicated service users for daemons
  • avoid shared writable directories across service users
  • keep human accounts separate from service accounts
  • prefer centrally managed identities for fleets, but know the local break-glass path
  • audit supplementary groups such as docker, wheel, sudo, adm, systemd-journal, disk, video, and hardware-specific groups

Membership in groups such as docker or disk can be equivalent to root in practice. Treat them as privileged access, not convenience.

File permissions

Traditional mode bits are simple and sharp:

-rwxr-x---  owner group other

For files, read means read bytes, write means modify bytes, execute means execute as a program or script. For directories, read means list names, write means create or remove names, and execute means traverse. Directory execute without read lets a process access known names without listing the directory.

stat /etc/passwd
namei -l /var/lib/example/data.db
chmod 0640 file
chown example:example file
chgrp example file
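
To make the owner, group, and other split concrete, here is a small sketch that decodes a mode value with Python's stat module. The function names are illustrative assumptions; the checks only look at the "other" class, so they do not model ownership, ACLs, or LSM decisions.

```python
import stat


def explain_mode(mode: int) -> dict:
    """Split a st_mode value into its symbolic form and per-class permission bits."""
    return {
        "symbolic": stat.filemode(mode),
        "owner": (mode >> 6) & 0o7,
        "group": (mode >> 3) & 0o7,
        "other": mode & 0o7,
        "is_dir": stat.S_ISDIR(mode),
    }


def can_traverse_but_not_list(mode: int) -> bool:
    """True when 'other' may traverse a directory (execute) but not list it (read)."""
    other = mode & 0o7
    return stat.S_ISDIR(mode) and bool(other & 0o1) and not (other & 0o4)
```

For example, explain_mode(os.stat(path).st_mode) can be applied to each path component that namei -l prints when a permission error needs to be traced.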

Special mode bits matter:

| Bit | On files | On directories | Risk |
| --- | --- | --- | --- |
| setuid | execute with file owner effective UID | usually ignored | root-owned setuid binaries are high-value escalation targets |
| setgid | execute with file group effective GID | new files inherit directory group | useful for shared project dirs, risky with broad write |
| sticky | rarely useful | only owner, dir owner, or root can delete entries | required for shared temp dirs like /tmp |

Find sensitive bits:

find / -xdev -perm -4000 -type f -ls
find / -xdev -perm -2000 -type f -ls
find / -xdev -perm -0002 ! -perm -1000 -type d -ls

Production systems should have an expected inventory of setuid and setgid files. A new root-owned setuid file is an incident until explained.
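
An expected inventory can be compared against what is actually on disk. The sketch below is one possible shape for that comparison, not a replacement for integrity monitoring; the function names and the baseline set are hypothetical. Unlike find -xdev, os.walk crosses filesystem boundaries, so point it at specific directories.

```python
import os
import stat


def special_bits(mode: int) -> list:
    """Name the special mode bits set in a st_mode value."""
    names = []
    if mode & stat.S_ISUID:
        names.append("setuid")
    if mode & stat.S_ISGID:
        names.append("setgid")
    if mode & stat.S_ISVTX:
        names.append("sticky")
    return names


def find_setid_files(root: str) -> dict:
    """Map path -> special bits for regular files under root, without following symlinks."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_mode & (stat.S_ISUID | stat.S_ISGID):
                found[path] = special_bits(st.st_mode)
    return found


def unexpected(found: dict, baseline: set) -> list:
    """Return paths that carry setuid or setgid but are missing from the approved baseline."""
    return sorted(path for path in found if path not in baseline)
```

A typical use would be unexpected(find_setid_files("/usr"), approved_paths), where approved_paths is a set maintained in configuration management.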

umask

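umask clears permission bits from the mode a process requests when it creates files and directories. The result is the requested mode AND NOT the umask, which is bit masking rather than arithmetic subtraction. A process that creates files with mode 0666 under umask 0027 gets 0640. A process that creates directories with mode 0777 under umask 0027 gets 0750.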

| umask | File result from 0666 | Directory result from 0777 | Use |
| --- | --- | --- | --- |
| 0022 | 0644 | 0755 | common default, world-readable |
| 0027 | 0640 | 0750 | production service default for private group access |
| 0077 | 0600 | 0700 | secrets and user-private files |
| 0002 | 0664 | 0775 | collaborative group directories |
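
The rows above follow from one bitwise operation. This is a small sketch of that arithmetic; apply_umask is an illustrative name, and the function models only the mask itself, not ACL defaults or explicit chmod calls made by the program.

```python
def apply_umask(requested: int, umask: int) -> int:
    """Return the mode left after the umask clears its bits from the requested mode."""
    return requested & ~umask & 0o7777


if __name__ == "__main__":
    for mask in (0o022, 0o027, 0o077, 0o002):
        file_mode = apply_umask(0o666, mask)
        dir_mode = apply_umask(0o777, mask)
        print(f"umask {mask:04o}: file {file_mode:04o}, directory {dir_mode:04o}")
```

The masking view matters for unusual values: a umask of 0111 leaves a 0666 request unchanged, because none of the execute bits it clears were requested.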

For services, set UMask= in the systemd unit rather than relying on login shell defaults. See 07 systemd Boot Init Units Timers Journald and Services for unit-level controls.

sudo

sudo is a policy engine for controlled command execution, not only a root wrapper. A sudo rule defines who may run what, as which user or group, from which host, with which authentication and environment handling.

Production rules:

  • edit with visudo
  • prefer command-specific rules over ALL=(ALL) ALL
  • avoid writable scripts in sudo command paths
  • avoid wildcards unless every expansion is understood
  • reset or tightly preserve environment
  • log sudo use centrally
  • require MFA or short-lived elevation for sensitive fleets when available

Common mistake:

operator ALL=(root) NOPASSWD: /usr/local/bin/backup *

If backup accepts arbitrary paths, config files, shell escapes, or plugin loading, this rule may be equivalent to root. A safe rule depends on more than the sudoers line: it depends on the called program's complete input surface.
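
A rough review aid can flag the rule shapes called out above. This is a heuristic sketch, not a sudoers parser: review_sudoers_rule is a hypothetical helper, it handles only simple single-line user specifications, and it cannot judge the input surface of the called program.

```python
import re


def review_sudoers_rule(line: str) -> list:
    """Return heuristic warnings for one simple sudoers user specification."""
    warnings = []
    _, _, rhs = line.partition("=")
    if "NOPASSWD:" in rhs:
        warnings.append("NOPASSWD removes the re-authentication step")
    # Drop the optional "(runas)" spec and any "TAG:" prefixes to reach the command list.
    commands = re.sub(r"^\s*\([^)]*\)\s*", "", rhs)
    commands = re.sub(r"^(?:[A-Z_]+:\s*)+", "", commands)
    for command in (part.strip() for part in commands.split(",")):
        if command == "ALL":
            warnings.append("grants every command")
        elif any(ch in command for ch in "*?["):
            warnings.append(f"wildcard in command: {command}")
    return warnings
```

Applied to the rule above, the sketch reports both the NOPASSWD tag and the trailing wildcard; a clean result does not mean the rule is safe.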

PAM overview

PAM, the Pluggable Authentication Modules framework, lets services compose authentication, account checks, password changes, and session setup. Login, sshd, sudo, su, display managers, and many other services can have separate PAM stacks.

| PAM phase | Purpose | Examples |
| --- | --- | --- |
| auth | prove identity | password, FIDO, smart card, Kerberos |
| account | decide whether access is allowed | expiry, time restrictions, host rules |
| password | update credentials | password quality and history |
| session | setup and teardown | limits, keyrings, home mounts, audit sessions |

Production caution: PAM changes can lock out administrators. Test through a second root session or console, stage changes, and know the recovery path. In clusters, PAM controls node login, not application pod identity unless explicitly integrated.

Capabilities

Linux capabilities split traditional root privilege into named units such as CAP_NET_BIND_SERVICE, CAP_SYS_ADMIN, CAP_SYS_PTRACE, and CAP_SYS_MODULE. Capabilities are per-thread attributes, and file capabilities can grant capabilities at exec time.

Inspect:

capsh --print
getpcaps $$
getcap -r /usr/bin /usr/sbin 2>/dev/null
grep Cap /proc/$$/status
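
The Cap lines in /proc/<pid>/status are hexadecimal bitmasks. The sketch below decodes them into names, assuming the capability numbering from the kernel's capability.h up to CAP_CHECKPOINT_RESTORE (bit 40); bits above that are ignored. The function names are illustrative, and capsh --decode does the same job on hosts where it is installed.

```python
# Capability names in bit order (bit 0 is CAP_CHOWN).
CAP_NAMES = ["CAP_" + name for name in """
    CHOWN DAC_OVERRIDE DAC_READ_SEARCH FOWNER FSETID KILL SETGID SETUID
    SETPCAP LINUX_IMMUTABLE NET_BIND_SERVICE NET_BROADCAST NET_ADMIN NET_RAW
    IPC_LOCK IPC_OWNER SYS_MODULE SYS_RAWIO SYS_CHROOT SYS_PTRACE SYS_PACCT
    SYS_ADMIN SYS_BOOT SYS_NICE SYS_RESOURCE SYS_TIME SYS_TTY_CONFIG MKNOD
    LEASE AUDIT_WRITE AUDIT_CONTROL SETFCAP MAC_OVERRIDE MAC_ADMIN SYSLOG
    WAKE_ALARM BLOCK_SUSPEND AUDIT_READ PERFMON BPF CHECKPOINT_RESTORE
""".split()]

CAP_FIELDS = ("CapInh", "CapPrm", "CapEff", "CapBnd", "CapAmb")


def decode_caps(mask: int) -> list:
    """Return the capability names whose bits are set in mask."""
    return [name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)]


def parse_status_caps(status_text: str) -> dict:
    """Extract the capability sets from the text of /proc/<pid>/status."""
    caps = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in CAP_FIELDS:
            caps[key] = int(value.strip(), 16)
    return caps
```

Reading /proc/self/status and passing its text to parse_status_caps gives the five sets for the current thread; decode_caps then turns each mask into names.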

Grant low-port bind without full root:

sudo setcap cap_net_bind_service=+ep /usr/local/bin/example
getcap /usr/local/bin/example

Production guidance:

  • prefer running as non-root with one narrow capability over root
  • avoid CAP_SYS_ADMIN; it is intentionally broad and often container-escape relevant
  • inventory file capabilities like setuid binaries
  • remove capabilities from interpreters and writable deployment paths
  • combine capabilities with systemd CapabilityBoundingSet=, AmbientCapabilities=, NoNewPrivileges=yes, and LSM policy

Capabilities are not a sandbox. A process with CAP_SYS_PTRACE, CAP_SYS_MODULE, CAP_DAC_OVERRIDE, or CAP_SYS_ADMIN can often cross boundaries that look separate at the file-permission layer.

setuid and setgid

setuid and setgid are old privilege transition mechanisms. They remain necessary for some core utilities, but they concentrate risk because the executable starts with elevated effective credentials.

Production handling:

  • minimize the installed setuid inventory
  • prefer capabilities or brokered privileged helpers where possible
  • ensure setuid binaries are root-owned and not writable by group or other
  • avoid setuid scripts
  • watch for setuid copies in writable paths
  • include setuid inventory in baseline integrity monitoring

Incident command:

find / -xdev \( -perm -4000 -o -perm -2000 \) -type f -printf '%m %u %g %p\n'

seccomp

seccomp filters reduce the system call surface available to a process. They are useful because many processes need only a small subset of kernel entry points. The kernel documentation is clear that seccomp filtering is not a complete sandbox; it is a primitive used by sandboxes and hardening profiles.

Common places seccomp appears:

  • container runtimes
  • browsers
  • systemd SystemCallFilter=
  • language runtimes or security wrappers
  • high-risk parsers and media processors

Production guidance:

  • start from known profiles rather than hand-writing broad deny lists
  • test under representative workload, including DNS, TLS, locale, time, signals, file rotation, and crash handling
  • log denied syscalls where possible during rollout
  • prefer allow-list thinking, but avoid breaking emergency diagnostics
  • combine with NoNewPrivileges=yes when installing filters as unprivileged code

Example systemd hardening:

[Service]
NoNewPrivileges=yes
SystemCallFilter=@system-service
SystemCallErrorNumber=EPERM
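
Whether these directives took effect can be checked from the Seccomp and NoNewPrivs fields of /proc/<pid>/status. The following is a minimal sketch that interprets those two fields; hardening_state is an illustrative name, and the mode table reflects the documented values 0 (disabled), 1 (strict), and 2 (filter).

```python
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def hardening_state(status_text: str) -> dict:
    """Report seccomp mode and no_new_privs from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        fields[key] = value.strip()
    state = {}
    if "Seccomp" in fields:
        state["seccomp"] = SECCOMP_MODES.get(int(fields["Seccomp"]), "unknown")
    if "NoNewPrivs" in fields:
        state["no_new_privs"] = fields["NoNewPrivs"] == "1"
    return state
```

For a service started with the unit fragment above, the main PID's status would be expected to show filter mode and no_new_privs set; the fields are absent on kernels built without the corresponding support.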

LSMs: AppArmor, SELinux, Landlock, and friends

Linux Security Modules add mandatory access control and other security hooks. The active LSM list is visible at:

cat /sys/kernel/security/lsm

The capability module is always present. Distros may enable minor modules such as Yama or Landlock and one major MAC module such as AppArmor or SELinux, depending on kernel configuration and boot parameters.
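
The file holds a single comma-separated list. The sketch below splits it and picks out the MAC module; the MAJOR_MAC set is an assumption made for illustration, and the sample string in the usage note is an example, not output from a specific distro.

```python
# Names treated as "major" MAC modules in this sketch (an assumption, not a kernel API).
MAJOR_MAC = {"selinux", "apparmor", "smack"}


def parse_lsm_list(text: str) -> list:
    """Split the contents of /sys/kernel/security/lsm into module names."""
    return [name for name in text.strip().split(",") if name]


def major_modules(names: list) -> list:
    """Return the modules from the list that this sketch treats as major MAC modules."""
    return [name for name in names if name in MAJOR_MAC]
```

For a string such as "lockdown,capability,landlock,yama,apparmor", major_modules returns ["apparmor"]; an empty result suggests the host relies on DAC and minor modules only.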

AppArmor

AppArmor is profile-oriented. Policy is usually attached to executable paths and describes what a confined program may access. It is common on Ubuntu and SUSE-derived systems.

Useful commands:

aa-status
sudo aa-complain /etc/apparmor.d/usr.sbin.example
sudo aa-enforce /etc/apparmor.d/usr.sbin.example
journalctl -k -g apparmor

Production guidance:

  • use complain mode to learn denials before enforcing new profiles
  • keep profiles with package or config management
  • remember unprofiled tasks run under normal DAC unless another LSM confines them
  • path-based policy can be easier to operate, but renames, bind mounts, and alternate paths require attention

SELinux

SELinux is label-oriented. Subjects and objects have security contexts, and policy decides allowed interactions. It is common on Fedora, RHEL, CentOS Stream, and derivatives.

Useful commands:

getenforce
sestatus
ls -Z /var/www
ps -eZ | head
sudo ausearch -m avc -ts recent
sudo restorecon -Rv /var/www
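
The labels printed by ls -Z and ps -eZ have the form user:role:type:level. A small sketch of splitting one follows; SELinuxContext and parse_context are illustrative names, and the split is limited to three separators because the level field can itself contain colons.

```python
from typing import NamedTuple


class SELinuxContext(NamedTuple):
    user: str
    role: str
    type: str
    level: str


def parse_context(label: str) -> SELinuxContext:
    """Split an SELinux security context into user, role, type, and optional level."""
    parts = label.split(":", 3)
    if len(parts) < 3:
        raise ValueError(f"not an SELinux context: {label!r}")
    level = parts[3] if len(parts) == 4 else ""
    return SELinuxContext(parts[0], parts[1], parts[2], level)
```

The type field is usually the part that matters when reading an AVC denial, since type enforcement rules are written against source and target types.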

Production guidance:

  • do not disable SELinux to fix an application; identify the denied action and correct labels or policy
  • understand the difference between permissive and enforcing modes
  • use distro-provided policy as the baseline
  • label persistence matters; chcon is quick, semanage fcontext plus restorecon is durable
  • AVC denials are evidence, but not every denial is the root cause

Landlock

Landlock is a stackable LSM intended to let processes, including unprivileged processes, restrict their own future access. It is useful for application-level sandboxing where a process voluntarily gives up ambient filesystem or network rights. It does not replace system-wide MAC policy because it is usually applied by the program itself or its launcher.

Check for evidence:

dmesg | grep -i landlock || journalctl -kb -g landlock
cat /sys/kernel/security/lsm

Production use is strongest when application code or a trusted launcher applies Landlock rules early, before parsing untrusted input.

Namespaces as isolation

Namespaces partition what a process can see: PIDs, mounts, UTS hostname, IPC, network, users, cgroups, and time. Containers are mostly composed from namespaces plus cgroups plus capabilities plus seccomp plus LSM policy.

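Inspect (the nsenter line needs root and starts a shell inside PID 1's namespaces, so it is more than a read-only check):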

lsns
readlink /proc/$$/ns/user
readlink /proc/1/ns/mnt
nsenter --target 1 --mount --uts --ipc --net --pid
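
Two processes share a namespace when their /proc/<pid>/ns links point at the same inode. The sketch below compares those links; the function names are illustrative, and reading another process's ns directory may require privileges, in which case the sketch returns an empty mapping.

```python
import os
import re

_NS_LINK = re.compile(r"^([a-z_]+):\[(\d+)\]$")


def parse_ns_link(target: str):
    """Parse a link target such as 'user:[4026531837]' into (type, inode), or None."""
    match = _NS_LINK.match(target)
    if not match:
        return None
    return match.group(1), int(match.group(2))


def namespaces_of(pid="self") -> dict:
    """Map each entry under /proc/<pid>/ns to its namespace inode number."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        entries = os.listdir(ns_dir)
    except OSError:
        return result  # no such process, no procfs, or not permitted
    for entry in entries:
        try:
            parsed = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        except OSError:
            continue
        if parsed:
            result[entry] = parsed[1]
    return result


def shared_namespaces(a: dict, b: dict) -> list:
    """Return the namespace entries whose inode is identical in both mappings."""
    return sorted(name for name in a if name in b and a[name] == b[name])
```

Comparing namespaces_of() with namespaces_of(1) shows which namespaces the current process shares with PID 1, which is a quick way to tell a host shell from a container shell.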

Production guidance:

  • namespaces isolate views, not all kernel attack surface
  • user namespaces change capability meaning inside the namespace, but host-level effects still depend on mappings and kernel checks
  • mount namespaces require careful propagation settings
  • PID namespaces hide process IDs but do not provide resource limits
  • network namespaces isolate network stacks, but host bridges, CNI, iptables, nftables, and eBPF still matter

Cgroups as control

cgroups control and account for resources. They are not primarily a secrecy boundary. They answer questions such as how much CPU, memory, IO, process count, or device access a group of processes may use.

Inspect:

systemd-cgls
systemd-cgtop
cat /proc/$$/cgroup
systemctl show example.service -p ControlGroup -p MemoryCurrent -p TasksCurrent
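
Each line of /proc/<pid>/cgroup has the form hierarchy-ID:controller-list:path, and on a cgroup v2 host there is a single line starting with 0::. A sketch of parsing that format follows; the function names and the example paths in the usage note are illustrative.

```python
def parse_proc_cgroup(text: str) -> list:
    """Parse the contents of /proc/<pid>/cgroup into a list of entries."""
    entries = []
    for line in text.splitlines():
        if not line:
            continue
        hierarchy, controllers, path = line.split(":", 2)
        entries.append({
            "hierarchy": int(hierarchy),
            "controllers": controllers.split(",") if controllers else [],
            "path": path,
        })
    return entries


def unified_path(entries: list):
    """Return the cgroup v2 path (the '0::' entry), or None on a pure v1 layout."""
    for entry in entries:
        if entry["hierarchy"] == 0 and not entry["controllers"]:
            return entry["path"]
    return None
```

For a line such as 0::/system.slice/example.service, unified_path returns /system.slice/example.service, which corresponds to a directory under the cgroup2 mount point (commonly /sys/fs/cgroup).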

Production guidance:

  • use cgroups to prevent noisy-neighbor failure
  • combine cgroup limits with alerts, because enforced limits can become outages
  • on Kubernetes nodes, avoid manual changes under pod cgroups unless debugging
  • for systemd services, prefer unit-level resource directives over direct cgroup filesystem writes

chroot limitations

chroot changes the apparent root directory for a process. It is not a full security boundary. A privileged process can often escape, and chroot does not isolate PIDs, network, IPC, mounts outside the setup, hostname, resource usage, or kernel attack surface.

Use chroot for build roots, recovery, packaging, and legacy workflows. For isolation, use namespaces, cgroups, seccomp, capabilities, and LSM policy together. On production systems, any design that says "secured by chroot" needs review.

Secrets on Linux

Secrets leak through more paths than teams expect: command arguments, environment variables, shell history, world-readable config, core dumps, debug endpoints, process inspection, logs, backups, and container image layers.

Safer patterns:

  • store root-owned secret files as 0600 or service-group-readable as 0640
  • put secrets under directories that are not listable by unrelated users
  • use tmpfs for short-lived material when appropriate
  • disable or restrict core dumps for secret-handling services
  • avoid passing secrets on command lines
  • avoid long-lived secrets in environment variables for high-risk services
  • use a real secret manager for rotation, audit, and revocation
  • make backup encryption and restore access part of the secret model

Commands:

find /etc /var/lib -xdev -type f -perm -004 -name '*secret*' -ls 2>/dev/null
grep -R --exclude-dir=.git -n 'BEGIN .*PRIVATE KEY' /etc /opt 2>/dev/null
coredumpctl list
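
The file-mode guidance above can be applied at write time and checked afterwards. This is a minimal sketch under the assumption that the secret lives in a regular file; write_secret and secret_mode_problems are hypothetical helpers, and the sketch does not cover directory permissions, ACLs, or LSM labels.

```python
import os
import stat


def secret_mode_problems(mode: int) -> list:
    """List reasons a file mode is too permissive for a secret (0600 and 0640 pass)."""
    problems = []
    if mode & 0o007:
        problems.append("accessible by other")
    if mode & 0o020:
        problems.append("group-writable")
    if mode & 0o111:
        problems.append("executable")
    return problems


def write_secret(path: str, data: bytes) -> None:
    """Create a new file with mode 0600 from the start and write the secret into it."""
    # O_EXCL refuses to reuse an existing file, so no earlier, looser mode can survive.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)


def check_secret_file(path: str) -> list:
    """Return the mode problems for an existing file."""
    return secret_mode_problems(stat.S_IMODE(os.stat(path).st_mode))
```

Passing the mode to os.open avoids the window that exists when a file is created with default permissions and tightened with chmod afterwards.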

For systemd services, see 07 systemd Boot Init Units Timers Journald and Services for environment files and unit hardening.

SSH hardening

SSH is often the production host control plane. Harden it like an internet-facing API even on private networks.

Baseline /etc/ssh/sshd_config direction:

PermitRootLogin no
PasswordAuthentication no
KbdInteractiveAuthentication no
PubkeyAuthentication yes
AllowGroups ssh-admins
X11Forwarding no
AllowTcpForwarding no
PermitTunnel no
ClientAliveInterval 300
ClientAliveCountMax 2
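
A baseline like this can be compared against a config file mechanically. The sketch below is a simplified reader, assuming whitespace-separated keyword and value pairs; it stops at the first Match block and ignores Include directives, so the effective configuration printed by sshd -T remains the authoritative view. BASELINE and the function names are illustrative.

```python
# Expected values taken from the baseline above; keys are lowercased for comparison.
BASELINE = {
    "permitrootlogin": "no",
    "passwordauthentication": "no",
    "kbdinteractiveauthentication": "no",
    "pubkeyauthentication": "yes",
    "x11forwarding": "no",
    "allowtcpforwarding": "no",
    "permittunnel": "no",
}


def parse_sshd_config(text: str) -> dict:
    """Read global keyword/value pairs; the first value seen for a keyword wins."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key = parts[0].lower()
        if key == "match":
            break  # conditional blocks are out of scope for this sketch
        settings.setdefault(key, parts[1].strip())
    return settings


def deviations(settings: dict, baseline: dict = BASELINE) -> dict:
    """Map each baseline keyword to (expected, actual) where the file differs or is silent."""
    report = {}
    for key, expected in baseline.items():
        actual = settings.get(key)
        if actual is None or actual.lower() != expected:
            report[key] = (expected, actual)
    return report
```

A keyword reported with an actual value of None is not set in the file, which means the compiled-in default applies and should be checked separately.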

Validate before reload (on Debian and Ubuntu the unit is named ssh rather than sshd):

sudo sshd -t
sudo systemctl reload sshd
journalctl -u sshd -b

Production guidance:

  • keep an active session open while changing sshd
  • use short-lived certificates or managed keys where available
  • remove stale keys during offboarding
  • monitor failed login patterns
  • avoid agent forwarding to untrusted hosts
  • document console or out-of-band recovery

On a local learning machine, password SSH may be acceptable on a private lab network. On production hosts, password login should be the rare exception.

auditd and evidence

auditd records security-relevant events from the kernel audit subsystem. It is useful for authentication, SELinux AVCs, watched files, exec events, and incident timelines. It can also be noisy and expensive if configured without focus.

Commands:

sudo auditctl -s
sudo auditctl -l
sudo ausearch -m USER_LOGIN,USER_AUTH -ts today
sudo ausearch -m AVC -ts recent
sudo aureport --auth
sudo aureport --exec

Example watch:

sudo auditctl -w /etc/sudoers -p wa -k sudoers_changes
sudo ausearch -k sudoers_changes

Production guidance:

  • define high-signal watches for identity, sudoers, SSH config, service units, package manager state, and sensitive app config
  • forward audit logs off-host
  • protect audit log retention
  • test performance before enabling broad exec logging on busy hosts
  • treat missing logs during a suspected incident as evidence itself

Kernel lockdown and module signing

Kernel lockdown restricts interfaces that allow root to modify the running kernel or read sensitive kernel memory. It is often associated with Secure Boot and has integrity and confidentiality implications. Module signing verifies that loadable kernel modules are signed by trusted keys when enforcement is configured.

Inspect:

cat /sys/kernel/security/lockdown 2>/dev/null || true
mokutil --sb-state 2>/dev/null || true
cat /proc/sys/kernel/modules_disabled
lsmod
modinfo module_name
journalctl -k -g lockdown

Production guidance:

  • use signed modules on Secure Boot fleets
  • avoid out-of-tree modules unless ownership, patch cadence, and signing are clear
  • know whether eBPF, kprobes, perf, hibernation, debugfs, or kexec workflows are affected by lockdown
  • never disable Secure Boot or lockdown during an incident without recording why it was done and who approved it
  • for cluster nodes, kernel module policy affects CNI, CSI, eBPF observability, GPU drivers, and security agents

Supply chain considerations

Linux host compromise often arrives through the supply chain: packages, repositories, curl-piped scripts, container images, language package managers, kernel modules, CI artifacts, or vendor agents.

Production controls:

  • pin trusted repositories and verify GPG key ownership
  • prefer distro packages for security-sensitive base components
  • review third-party install scripts before execution
  • restrict who can add apt, dnf, yum, zypper, pacman, snap, or flatpak sources
  • scan container images and host packages for known CVEs
  • track SBOMs or at least package manifests for critical images
  • use reproducible image builds where feasible
  • monitor unexpected changes under /usr/local/bin, /opt, systemd unit paths, and shell profile directories

Local learning machines can tolerate experimental package sources. Production hosts should have a small, explainable package trust root.

Patching and CVE response

Patch response is an operational process, not only a package command.

Commands vary by distro:

uname -a
cat /etc/os-release
apt list --upgradable 2>/dev/null
dnf updateinfo list security 2>/dev/null
yum updateinfo list security 2>/dev/null
zypper list-patches 2>/dev/null
pacman -Qu 2>/dev/null
needrestart 2>/dev/null

Production guidance:

  • classify by exposure, exploitability, privilege required, and asset criticality
  • patch test rings before broad rollout when time allows
  • reboot, or restart the affected services, when kernel, libc, OpenSSL, systemd, container runtime, or critical daemon updates require it
  • verify that the running process is using the patched binary or library
  • in clusters, coordinate node drains, disruption budgets, and control-plane safety
  • document accepted risk when patching is deferred
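
One way to verify that a running process picked up a patched library is to look for mappings of files that have since been replaced on disk, which /proc/<pid>/maps marks with a " (deleted)" suffix. The sketch below extracts those paths; deleted_mappings is an illustrative name, and the result can still include benign entries such as deleted temporary files.

```python
def deleted_mappings(maps_text: str) -> list:
    """Return file paths that a process still maps although they were removed or replaced."""
    suffix = " (deleted)"
    stale = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (pathname may contain spaces).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a pathname
        path = fields[5]
        if not path.startswith("/") or path.startswith("/memfd:"):
            continue  # skip pseudo-entries such as [heap] and memfd regions
        if path.endswith(suffix):
            stale.add(path[: -len(suffix)])
    return sorted(stale)
```

A process that lists a patched library here is still executing the old code and needs a restart; tools such as needrestart automate a similar check.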

Incident response commands

Use commands that preserve evidence before changing state. Prefer read-only inspection first.

Identity and login:

who
w
last -a | head -50
lastlog | head
faillock --user example 2>/dev/null
journalctl -u sshd -b

Processes and network:

ps auxwwf
pstree -aps
ss -tulpn
lsof -nP -i 2>/dev/null
lsns
systemd-cgls

Persistence:

systemctl list-unit-files
systemctl list-timers --all
systemctl list-units --failed
find /etc/systemd/system /usr/local/bin /opt -xdev -type f -mtime -7 -ls 2>/dev/null
crontab -l
ls -la /etc/cron* /var/spool/cron 2>/dev/null

Privilege and policy:

getent passwd
getent group sudo
getent group wheel
sudo -l -U example
find / -xdev -perm -4000 -type f -ls
getcap -r / 2>/dev/null
cat /sys/kernel/security/lsm
getenforce 2>/dev/null
aa-status 2>/dev/null

Logs and kernel evidence:

journalctl -b
journalctl --list-boots
journalctl -p warning..alert -b
journalctl -k -b
dmesg -T
ausearch -ts recent 2>/dev/null

Package and file integrity:

dpkg -V 2>/dev/null
rpm -Va 2>/dev/null
find / -xdev -type f -mtime -1 -ls 2>/dev/null

Production caution: actions such as killing processes, deleting files, rotating logs, rebooting, or patching can destroy volatile evidence. In a serious incident, capture memory, disk, logs, and cloud control-plane evidence according to the incident runbook before taking containment actions that alter state.

Common mistakes

| Mistake | Why it hurts | Better practice |
| --- | --- | --- |
| adding a user to docker for convenience | Docker control can often become host root | use rootless containers or tightly controlled admin path |
| fixing permission errors with chmod -R 777 | destroys confidentiality and integrity | identify exact UID, GID, mode, ACL, or LSM denial |
| disabling SELinux or AppArmor globally | removes a whole mandatory control layer | fix labels, profiles, or policy |
| granting CAP_SYS_ADMIN to a container | near-root kernel attack surface | grant narrow capabilities or redesign |
| storing secrets in unit environment | easy process and log leakage | use secret manager, credentials, or protected files |
| assuming chroot is a sandbox | privileged escape paths remain | use namespaces, cgroups, seccomp, and LSMs |
| relying only on sudo logs | misses direct root, setuid, service, and key misuse | combine auditd, journal, SSH logs, and file integrity |
| patching without restart validation | vulnerable code may still be mapped | check running processes and reboot requirements |

Local vs production operating stance

| Area | Local learning machine | Production host or cluster |
| --- | --- | --- |
| Users and groups | create and delete freely for practice | managed identities, reviewed group grants |
| Permissions | experiment with mode bits and ACLs | least privilege, baseline scans, no broad recursive fixes |
| sudo | learn rule syntax in a VM | command-specific, logged, MFA or approval where possible |
| Capabilities | test with throwaway binaries | inventory and pair with bounding sets |
| LSMs | permissive or complain mode is useful | enforcing mode with documented exceptions |
| Namespaces | use unshare and containers to learn | understand runtime defaults and escape surface |
| Secrets | local .env files can be acceptable | rotation, audit, protected storage, no shell history |
| Patching | update when convenient | severity-driven rings, reboot coordination, evidence |
| Incident response | practice commands | preserve evidence and follow runbooks |

The field rule: root is not a design pattern. Every privilege grant should be narrow, observable, revocable, and tied to a named operational need.
