@evan-goode
Member

Related: #2129

This is another step towards proper locking so that multiple DNF5 processes don't interfere with each other.

Adds Base::lock_system_repo and Base::unlock_system_repo:

/// Acquire an advisory lock on the installroot's system repository.
/// The lock will be automatically released when Base goes out of scope, or manually when unlock_system_repo is called.
/// Can be called multiple times to upgrade or downgrade a READ lock to a WRITE lock or vice versa.
/// Should be called before the system repo is loaded, and the lock should be held until all transactions are
/// complete and other processes can safely re-read the RPMDB and resolve transactions.
/// @throw libdnf5::SystemError if an unexpected error occurs when locking
/// @return true if acquiring the lock succeeded, false otherwise
bool lock_system_repo(
    libdnf5::utils::LockAccessType access = libdnf5::utils::LockAccessType::WRITE,
    libdnf5::utils::LockBlockingType blocking = libdnf5::utils::LockBlockingType::NON_BLOCKING);

/// Release the lock obtained by lock_system_repo.
/// Idempotent. No-op if there is currently no lock.
/// @throw libdnf5::SystemError if an unexpected error occurs when unlocking
void unlock_system_repo();

And calls Base::lock_system_repo from Context::load_repos before the system repo is loaded.

The idea here is that we need to hold a lock the entire time from before the transaction is resolved until after the transaction is complete and the RPMDB is written. Otherwise, another DNF process could resolve a transaction using a soon-to-be-invalid version of the system repo and get an invalid transaction.

This approach is an improvement over DNF4, which blocks while another transaction is running but does not block before loading the system repo and resolving a new transaction.
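
To make the locking semantics described above concrete, here is a minimal sketch of a lock with the same shape: READ or WRITE access, blocking or non-blocking, convertible, idempotent unlock, released on destruction. It is not the libdnf5 Locker. It assumes Linux and uses flock(2), picked only because flock locks belong to the open file description, so two handles inside one process contend the way two processes would; this PR does not say which primitive libdnf5 uses. `SketchLock`, `SketchAccess`, `SketchBlocking` and `demo_sketch_lock` are hypothetical names.

```cpp
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <string>
#include <system_error>

enum class SketchAccess { READ, WRITE };
enum class SketchBlocking { NON_BLOCKING, BLOCKING };

// Hypothetical stand-in for the lock described above; not libdnf5 code.
class SketchLock {
public:
    explicit SketchLock(const std::string & path)
        : fd(open(path.c_str(), O_RDONLY | O_CREAT | O_CLOEXEC, 0644)) {
        if (fd == -1) {
            throw std::system_error(errno, std::generic_category(), "open");
        }
    }
    SketchLock(const SketchLock &) = delete;
    SketchLock & operator=(const SketchLock &) = delete;
    // Closing the descriptor drops any lock that is still held.
    ~SketchLock() { close(fd); }

    // Returns false if NON_BLOCKING was requested and the lock is contended.
    // Calling it again with a different access type converts the lock.
    bool lock(SketchAccess access, SketchBlocking blocking) {
        int op = access == SketchAccess::WRITE ? LOCK_EX : LOCK_SH;
        if (blocking == SketchBlocking::NON_BLOCKING) {
            op |= LOCK_NB;
        }
        while (flock(fd, op) == -1) {
            if (errno == EINTR) continue;            // interrupted, retry
            if (errno == EWOULDBLOCK) return false;  // someone else holds it
            throw std::system_error(errno, std::generic_category(), "flock");
        }
        return true;
    }

    // Idempotent: unlocking a descriptor that holds no lock succeeds.
    void unlock() {
        if (flock(fd, LOCK_UN) == -1) {
            throw std::system_error(errno, std::generic_category(), "flock");
        }
    }

private:
    int fd;
};

// Two handles on one temporary lockfile play the role of two processes.
bool demo_sketch_lock() {
    char path[] = "/tmp/sketch-lock-XXXXXX";
    int tmp = mkstemp(path);
    assert(tmp != -1);
    close(tmp);
    bool ok = true;
    {
        SketchLock a(path);
        SketchLock b(path);
        const auto nb = SketchBlocking::NON_BLOCKING;
        ok = ok && a.lock(SketchAccess::READ, nb);   // readers share
        ok = ok && b.lock(SketchAccess::READ, nb);
        b.unlock();
        ok = ok && a.lock(SketchAccess::WRITE, nb);  // upgrade, no other holder
        ok = ok && !b.lock(SketchAccess::READ, nb);  // writer excludes readers
        ok = ok && a.lock(SketchAccess::READ, nb);   // downgrade
        ok = ok && b.lock(SketchAccess::READ, nb);
        a.unlock();
        a.unlock();                                  // second call is a no-op
    }
    unlink(path);
    return ok;
}
```

Calling `demo_sketch_lock()` from a `main` function walks through sharing, upgrading, and downgrading. One caveat of this particular primitive: flock(2) documents that converting a lock is not guaranteed to be atomic, so a real implementation would need to decide what a failed upgrade means for the lock already held.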

A helpful message is printed listing the other processes DNF5 is waiting for:

$ sudo dnf5 install rsms-inter-fonts
Waiting for other processes to finish:
62409	dnf5 install blender
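
The PR does not say how the list of processes is gathered. One possible way, assuming the lock is a POSIX fcntl record lock, is `F_GETLK`, which reports the PID of one process holding a conflicting lock; on Linux the command line could then be read from `/proc/<pid>/cmdline`. The sketch below covers only the PID lookup, and `find_blocker` and `demo_find_blocker` are hypothetical names.

```cpp
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstdlib>

// Whole-file lock request of the given type (l_start = 0, l_len = 0).
static struct flock whole_file(short type) {
    struct flock fl {};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    return fl;
}

// Returns the PID of one process whose lock would block a write lock on fd,
// 0 if nothing would block it, or -1 on error.
pid_t find_blocker(int fd) {
    struct flock fl = whole_file(F_WRLCK);
    if (fcntl(fd, F_GETLK, &fl) == -1) return -1;
    return fl.l_type == F_UNLCK ? 0 : fl.l_pid;
}

// Forks a child that holds a write lock, then asks who is blocking us.
bool demo_find_blocker() {
    char path[] = "/tmp/sketch-blocker-XXXXXX";
    int fd = mkstemp(path);
    assert(fd != -1);
    int ready[2];
    int done[2];
    if (pipe(ready) == -1 || pipe(done) == -1) return false;

    pid_t child = fork();
    if (child == -1) return false;
    if (child == 0) {
        close(ready[0]);
        close(done[1]);
        struct flock fl = whole_file(F_WRLCK);
        char status = fcntl(fd, F_SETLK, &fl) == 0 ? 'k' : 'e';
        if (write(ready[1], &status, 1) != 1) _exit(1);
        // Hold the lock until the parent closes its end of the pipe.
        char c;
        for (;;) {
            ssize_t n = read(done[0], &c, 1);
            if (n == 0 || (n == -1 && errno != EINTR)) break;
        }
        _exit(0);
    }

    close(ready[1]);
    close(done[0]);
    char status = 'e';
    bool child_locked = read(ready[0], &status, 1) == 1 && status == 'k';
    pid_t during = find_blocker(fd);  // expected: the child's PID
    close(done[1]);                   // lets the child exit and drop its lock
    waitpid(child, nullptr, 0);
    pid_t after = find_blocker(fd);   // expected: 0, nobody is blocking
    close(ready[0]);
    close(fd);
    unlink(path);
    return child_locked && during == child && after == 0;
}
```

`F_GETLK` returns only one conflicting holder per call, so a table with several waiting-for entries would need repeated queries or a different source of information.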

Blocks until the lock is acquired, and prints a table to stderr
listing the other processes we're waiting for:

$ sudo dnf5 install rsms-inter-fonts
Waiting for other processes to finish:
62409	dnf5 install blender

For now, always acquire a write lock. In the future, we could perhaps
add a way to tell the context whether write operations are allowed. See
also cmd_requires_privileges.
@evan-goode evan-goode requested a review from a team as a code owner November 17, 2025 23:05
@evan-goode evan-goode requested review from kontura and removed request for a team November 17, 2025 23:05
@evan-goode evan-goode marked this pull request as draft November 17, 2025 23:05
Add a new flag to Context to set whether we plan on modifying the system repo.

If will_modify_system is false, then Context::load_repos will acquire a
READ lock on the system repository before loading it. If
will_modify_system is true (the default), then a WRITE lock will be
acquired.
@evan-goode evan-goode marked this pull request as ready for review November 21, 2025 01:36
@evan-goode evan-goode marked this pull request as draft November 21, 2025 01:41
@evan-goode
Member Author

Now the non-privileged commands acquire a READ lock, not a WRITE lock.

I'm still trying to figure out how this should work for unprivileged users. If a root DNF process has not yet created /run/dnf, then an unprivileged user doesn't have sufficient permissions to create /run/dnf and can't obtain any lock, either read or write. So I'm thinking the lockfile needs to be persistent somewhere, always owned by root with mode 0664 or 0644. That way unprivileged users will always be able to open an O_RDONLY FD, but they won't be able to delete the lockfile or write garbage to it.

Maybe we could put it in /usr/lib/sysimage/libdnf5? That should work with installroots too.

We'd also need to extend Locker to not delete the lockfiles, which will likely be an ABI-breaking change.

We want the system-repo.lock to be persisted on the filesystem so that
unprivileged users can acquire a read lock on it without needing the
permissions to create it.

This is an ABI-breaking change, so this commit also bumps the version to
5.4.0.0.
Moves system-repo.lock from /run/dnf/system-repo.lock to
/usr/lib/sysimage/libdnf5/system-repo.lock and makes it persistent and
provided by libdnf5. This way, the file can always be owned by root and
other unprivileged users can open it for reading and obtain a read lock
without needing permissions to create the file.

Also adds Base::get_system_repo_lock() to get a pointer to the Locker
instance and renames LockAccessType to LockAccess and LockBlockingType
to LockBlocking.
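
The reasoning behind the persistent, root-owned lockfile can be checked with a small sketch. It assumes POSIX fcntl record locks, which this PR does not confirm as the primitive in use: a read lock only needs a descriptor opened for reading, a write lock needs one opened for writing, and a file that does not exist cannot be locked by someone who lacks permission to create it. `try_lock_readonly` and `demo_readonly_lock` are hypothetical names, and a temporary file stands in for the real lockfile.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <string>

// Opens path read-only (never creating it) and tries a non-blocking
// whole-file lock of the given type (F_RDLCK or F_WRLCK).
// Returns 0 on success, otherwise the errno of the failing call.
int try_lock_readonly(const std::string & path, short type) {
    int fd = open(path.c_str(), O_RDONLY | O_CLOEXEC);
    if (fd == -1) return errno;
    struct flock fl {};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;  // l_start = 0 and l_len = 0 cover the whole file
    int err = fcntl(fd, F_SETLK, &fl) == -1 ? errno : 0;
    close(fd);               // also releases the lock if one was taken
    return err;
}

bool demo_readonly_lock() {
    char path[] = "/tmp/sketch-ro-XXXXXX";
    int tmp = mkstemp(path);
    assert(tmp != -1);
    close(tmp);
    // A reader with an O_RDONLY descriptor can take a read lock...
    bool read_ok = try_lock_readonly(path, F_RDLCK) == 0;
    // ...but a write lock needs a descriptor opened for writing.
    bool write_rejected = try_lock_readonly(path, F_WRLCK) == EBADF;
    unlink(path);
    // Without the file there is nothing to open, hence nothing to lock.
    bool missing = try_lock_readonly(path, F_RDLCK) == ENOENT;
    return read_ok && write_rejected && missing;
}
```

The last check mirrors the /run/dnf problem: when the lockfile is absent, a user who cannot create it gets no lock at all.
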
@evan-goode
Member Author

I've moved the lockfile to /usr/lib/sysimage/libdnf5/system-repo.lock and refactored Locker with a p_impl.

Now I think we just need some flag to ignore the locks in the case where e.g. a privileged user doesn't care about an unprivileged user's read lock on the system repo.

The option is called "skip_system_repo_lock" as opposed to
"ignore_system_repo_lock" since we don't even try to acquire a lock if
skip_system_repo_lock is enabled. This way, the option is useful for
situations where the unprivileged user may not have permissions to
create the lockfile, e.g. in a root-owned installroot that does not
contain /usr/lib/sysimage/libdnf5/system-repo.lock.
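
Taken together, will_modify_system and skip_system_repo_lock amount to a three-way decision. The sketch below spells it out; `SketchLockAccess`, `LockPolicy` and `choose_system_repo_lock` are hypothetical names, and only the two flag names come from this PR.

```cpp
#include <optional>

enum class SketchLockAccess { READ, WRITE };

// Hypothetical bundle of the two settings discussed in this PR.
struct LockPolicy {
    bool will_modify_system{true};      // default taken from the PR text
    bool skip_system_repo_lock{false};  // assumed to default to off
};

// std::nullopt means: do not even open the lockfile.
constexpr std::optional<SketchLockAccess> choose_system_repo_lock(const LockPolicy & policy) {
    if (policy.skip_system_repo_lock) {
        return std::nullopt;
    }
    return policy.will_modify_system ? SketchLockAccess::WRITE : SketchLockAccess::READ;
}

// Defaults: a command is assumed to modify the system, so it takes a WRITE lock.
static_assert(choose_system_repo_lock(LockPolicy{}) == SketchLockAccess::WRITE);
// Read-only commands take a READ lock.
static_assert(choose_system_repo_lock(LockPolicy{false, false}) == SketchLockAccess::READ);
// Skipping wins over everything else: no lock is attempted.
static_assert(!choose_system_repo_lock(LockPolicy{true, true}).has_value());
```

Returning "no lock" rather than "lock, but ignore failure" matches the naming rationale above: with skip enabled, the lockfile is never touched, so a missing or uncreatable file is not an error.
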
This is an API-breaking change. This way we can use Context::load_repos
even when available repositories are not needed, instead of using
RepoSack::load_repos. Context::load_repos has the logic for locking the
system repo.
@evan-goode evan-goode marked this pull request as ready for review November 25, 2025 01:51
@evan-goode
Member Author

I think this is in a good place for initial review. I ended up making an API-breaking change to add a load_system flag to Context::load_repos.

@kontura kontura self-assigned this Dec 1, 2025
Contributor

@kontura kontura left a comment

While I don't think these breaking changes will actually break anyone's code, I am not thrilled about bumping the major version again so soon.

I think we could get around it by introducing new API instead, but I am not sure whether there is any real technical reason for this.


bool load_system_repo{false};
LoadAvailableRepos load_available_repos{LoadAvailableRepos::NONE};
bool will_modify_system{true};
Contributor

If I understand correctly this value is derived from (it duplicates) information already present in the Context. Is there a plan to use it somehow differently later on?

Currently I would rather make cmd_requires_privileges(dnf5::Context context) a private method of Context and use it directly. It would require introducing a private header file for Context, where we would likely move Context::Impl, but I think that would be a good move regardless.
What do you think?

Member Author

It may be useful in the future for commands (especially those provided by plugins) to declare for themselves whether they will modify the system rather than doing the heuristics in cmd_requires_privileges. Also if we ever do something about https://issues.redhat.com/browse/RHEL-67915, there would be a distinction between "will modify the system repository" (inside the installroot) and "needs root privileges".

But for now your suggestion is maybe better, like you said, a context_private.hpp would be a good idea.


/// Sets callbacks for repositories and loads them, updating metadata if necessary.
void load_repos(bool load_system);
void load_repos(bool load_system, bool load_available);
Contributor

While we are making changes here, I am wondering whether the Context.load_repos(...) API should be public at all. If I am not mistaken, this API is for dnf5 plugins, and those will already be using it through the dnf5 main function.
I see it is now used in the makecache command, but that feels more like an oversight to me because it results in calling it twice. I believe it could easily be removed.

This would need a private header for Context as well, though.

Member Author

I agree, it could be made private.

@evan-goode
Member Author

> While I don't think these breaking changes will actually break anyone's code, I am not thrilled about bumping the major version again so soon.
>
> I think we could get around it by introducing new API instead, but I am not sure whether there is any real technical reason for this.

Most of the ABI/API changes are in Locker. So we could create a LockerV2 or something, yes. IMO it's better to just do the ABI/API break. In my view, there is not much harm in a major version bump other than the ABI/API break that it signifies. Stable F43 is still on DNF 5.2 so we would not need to do backporting to 5.3.
