Discussion:
[Bug 213611] ports-mgmt/pkg: 1.9.1 pkgdb locking doesn't work on NFS
b***@freebsd.org
2016-10-19 13:16:51 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

Bug ID: 213611
Summary: ports-mgmt/pkg: 1.9.1 pkgdb locking doesn't work on
NFS
Product: Ports & Packages
Version: Latest
Hardware: Any
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: Individual Port(s)
Assignee: ***@FreeBSD.org
Reporter: ***@gmail.com
Assignee: ***@FreeBSD.org
Flags: maintainer-feedback?(***@FreeBSD.org)

pkg info fails with following error, if /var/db/pkg is on NFS:

pkg: sqlite error while executing iterator in file pkgdb_iterator.c:1033:
database is locked

ktrace:

61912 pkg CALL setgroups(0x1,0x802780abc)
61912 pkg RET setgroups 0
61912 pkg CALL setegid(0xfffe)
61912 pkg RET setegid 0
61912 pkg CALL setgid(0xfffe)
61912 pkg RET setgid 0
61912 pkg CALL seteuid(0xfffe)
61912 pkg RET seteuid 0
61912 pkg CALL setuid(0xfffe)
61912 pkg RET setuid 0
61912 pkg CALL
mkdir(0x80301c2e8,0x1ff<S_IRUSR|S_IWUSR|S_IXUSR|S_IRGRP|S_IWGRP|S_IXGRP|S_IROTH|S_IWOTH|S_IXOTH>)
61912 pkg NAMI "/var/db/pkg/local.sqlite.lock"
61912 pkg RET mkdir -1 errno 13 Permission denied

==> privileges get dropped and only afterwards mkdir
/var/db/pkg/local.sqlite.lock is tried, which has to fail as user "nobody".



Verified with "chmod 777 /var/db/pkg/"

drwxr-xr-x 2 nobody nobody 4096 Oct 19 12:50 local.sqlite.lock



From src/info.c:

322
323 drop_privileges();
324 if (pkgdb_obtain_lock(db, PKGDB_LOCK_READONLY) != EPKG_OK) {
325 pkgdb_close(db);

Probably wrong order of drop_privileges and pkgdb_obtain_lock?


This happens only on NFS because of this in src/pkgdb.c:

1026 /*
1027 * Fall back on unix-dotfile locking strategy if on a network filesystem
1028 */

(...)
1034 sqlite3_vfs_register(sqlite3_vfs_find("unix-dotfile"), 1);


(Our quick workaround was to comment out the whole "fall back on unix-dotfile
on NFS" part. Probably not the best solution ...)
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2016-10-19 13:16:51 UTC
Permalink
Markus Wennrich <***@gmail.com> has reassigned Bugzilla Automation
<***@FreeBSD.org>'s request for maintainer-feedback to ***@FreeBSD.org:
Bug 213611: ports-mgmt/pkg: 1.9.1 pkgdb locking doesn't work on NFS
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611



--- Description ---
pkg info fails with following error, if /var/db/pkg is on NFS:

pkg: sqlite error while executing iterator in file pkgdb_iterator.c:1033:
database is locked

ktrace:

61912 pkg CALL setgroups(0x1,0x802780abc)
61912 pkg RET setgroups 0
61912 pkg CALL setegid(0xfffe)
61912 pkg RET setegid 0
61912 pkg CALL setgid(0xfffe)
61912 pkg RET setgid 0
61912 pkg CALL seteuid(0xfffe)
61912 pkg RET seteuid 0
61912 pkg CALL setuid(0xfffe)
61912 pkg RET setuid 0
61912 pkg CALL
mkdir(0x80301c2e8,0x1ff<S_IRUSR|S_IWUSR|S_IXUSR|S_IRGRP|S_IWGRP|S_IXGRP|S_IROTH
|S_IWOTH|S_IXOTH>)
61912 pkg NAMI "/var/db/pkg/local.sqlite.lock"
61912 pkg RET mkdir -1 errno 13 Permission denied

==> privileges get dropped and only afterwards mkdir
/var/db/pkg/local.sqlite.lock is tried, which has to fail as user "nobody".



Verified with "chmod 777 /var/db/pkg/"

drwxr-xr-x 2 nobody nobody 4096 Oct 19 12:50 local.sqlite.lock



From src/info.c:

322
323 drop_privileges();
324 if (pkgdb_obtain_lock(db, PKGDB_LOCK_READONLY) != EPKG_OK) {
325 pkgdb_close(db);

Probably wrong order of drop_privileges and pkgdb_obtain_lock?


This happens only on NFS because of this in src/pkgdb.c:

1026 /*
1027 * Fall back on unix-dotfile locking strategy if on a network filesystem
1028 */

(...)
1034 sqlite3_vfs_register(sqlite3_vfs_find("unix-dotfile"), 1);


(Our quick workaround was to comment out the whole "fall back on unix-dotfile
on NFS" part. Probably not the best solution ...)
b***@freebsd.org
2016-10-19 16:03:07 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

--- Comment #1 from Baptiste Daroussin <***@FreeBSD.org> ---
Your workaround is a good one if you have setup correctly nfs locking. The
reason for the fallback is because most users do not.

I will fix asap.
I bet pkg audit has the same issue
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2016-12-08 09:30:50 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

--- Comment #2 from Baptiste Daroussin <***@FreeBSD.org> ---
Should be fixed, can you confirm? (in 1.9.4)
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2016-12-14 08:40:10 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

--- Comment #3 from dornergd <***@hob.de> ---
I have tested pkg 1.9.4 and still get the same error in combination with NFS:
"pkg: sqlite error while executing iterator in file pkgdb_iterator.c:1033:
database is locked.

To get more information I activated the debug output:
# pkg -dddd info
DBG(1)[22647]> pkg initialized
DBG(4)[22647]> Pkgdb: running 'PRAGMA user_version;'
DBG(4)[22647]> Pkgdb: executing 'PRAGMA foreign_keys = ON;'
DBG(4)[22647]> Pkgdb: preparing statement 'INSERT OR IGNORE INTO mtree(content)
VALUES(?1)'
...
DBG(4)[22647]> Pkgdb: preparing statement 'INSERT OR IGNORE INTO
requires(require) VALUES(?1)'
DBG(4)[22647]> Pkgdb: running 'SELECT id, origin, name, name as uniqueid,
version, comment, desc, message, arch, maintainer, www, prefix, flatsize,
licenselogic, automatic, locked, time, manifestdigest, vital FROM packages AS p
ORDER BY p.name;'
pkg: sqlite error while executing iterator in file pkgdb_iterator.c:1033:
database is locked

But when I start "pkg shell" and run the "SELECT"-statement there, I get a list
of all my installed packages,

Additional I found out that this problem exists since version 1.9.0.
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2017-02-21 16:43:32 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

Rodney W. Grimes <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@FreeBSD.org

--- Comment #4 from Rodney W. Grimes <***@FreeBSD.org> ---
I am still seeing this bug with pkg 1.9.4 on aarch64/rpi3.
I well attempt to verify that it still exists on amd64/i386.
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2017-03-02 12:06:29 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

Wolfgang Meyer <***@hob.de> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@hob.de

--- Comment #5 from Wolfgang Meyer <***@hob.de> ---
The problem is still present with 1.10.0
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2017-03-03 13:30:30 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

iz-***@hs-karlsruhe.de changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |iz-***@hs-karlsruhe.de

--- Comment #6 from iz-***@hs-karlsruhe.de ---
I am investigating the same problem.
NFS lockd and statd are up an running.

As far as I can tell sqlite uses the directory local.sqlite.lock in /var/db/pkg
for dotlocking /var/db/pkg/local.sqlite. But it never deletes it _there_.

Doing "pkg-static info" on an unlocked database shows the expected output
but leaves local.sqlite.lock in /var/db/pkg.

Next time you call pkg or pkg-static you will get the "database is locked"
error unless you are fast enough to remove /var/db/pkg/local.sqlite while
pkg is running.

The following part of ktrace/kdump shows the locking is done with a relative
path. So it happens in /var/db/pkg.
But the unlocking is done with a absolute path. An this path does not match
the relative one used for locking.

95575 pkg-static CALL stat(0xbfbfe150,0xbfbfe0f0)
95575 pkg-static NAMI "/var/db/pkg"
95575 pkg-static STRU struct stat {dev=973143811, ino=913835, mode=040755,
nlink=2, uid=0, gid=0, rdev=0, atime=1488515387.614408874,
mtime=1488543581.397913666, ctime=1488543581.397913666, birthtime=-1,
size=4096, blksize=4096, blocks=8, flags=0x0 }
95575 pkg-static RET stat 0
95575 pkg-static CALL eaccess(0xbfbfe150,0x4<R_OK>)
95575 pkg-static NAMI "/var/db/pkg"
95575 pkg-static RET eaccess 0
95575 pkg-static CALL stat(0xbfbfe150,0xbfbfe0f0)
95575 pkg-static NAMI "/var/db/pkg/local.sqlite"
95575 pkg-static STRU struct stat {dev=973143811, ino=913839, mode=0100644,
nlink=1, uid=0, gid=0, rdev=0, atime=1488508880.607008273,
mtime=1488508814.956928131, ctime=1488508814.956928131, birthtime=-1,
size=64334848, blksize=4096, blocks=125672, flags=0x0 }
95575 pkg-static RET stat 0
95575 pkg-static CALL eaccess(0xbfbfe150,0x4<R_OK>)
95575 pkg-static NAMI "/var/db/pkg/local.sqlite"
95575 pkg-static RET eaccess 0
95575 pkg-static CALL
openat(AT_FDCWD,0x2880f0e0,0x120000<O_RDONLY|O_DIRECTORY|O_CLOEXEC>)
95575 pkg-static NAMI "/var/db/pkg"
95575 pkg-static RET openat 4
95575 pkg-static CALL faccessat(0x4,0x83ecfb0,0x4<R_OK>,0x100)
95575 pkg-static NAMI "local.sqlite"
95575 pkg-static RET faccessat 0
95575 pkg-static CALL statfs(0x2880f0e0,0xbfbfe358)
95575 pkg-static NAMI "/var/db/pkg"
95575 pkg-static RET statfs 0
95575 pkg-static CALL lstat(0x2881f728,0xbfbfe390)
95575 pkg-static NAMI "local.sqlite"
95575 pkg-static RET lstat -1 errno 2 No such file or directory
95575 pkg-static CALL __getcwd(0x28848008,0x1ff)
95575 pkg-static NAMI "/root"
95575 pkg-static RET __getcwd 0
95575 pkg-static CALL getpid
95575 pkg-static RET getpid 95575/0x17557
95575 pkg-static CALL getpid
95575 pkg-static RET getpid 95575/0x17557
95575 pkg-static CALL fstatat(0x4,0x2884d206,0xbfbfdfe0,0)
95575 pkg-static NAMI "local.sqlite"
95575 pkg-static STRU struct stat {dev=973143811, ino=913839, mode=0100644,
nlink=1, uid=0, gid=0, rdev=0, atime=1488508880.607008273,
mtime=1488508814.956928131, ctime=1488508814.956928131, birthtime=-1,
size=64334848, blksize=4096, blocks=125672, flags=0x0 }
95575 pkg-static RET fstatat 0
95575 pkg-static CALL
openat(0x4,0x2884d206,0x100202<O_RDWR|O_CREAT|O_CLOEXEC>,0644<S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH>)
95575 pkg-static NAMI "local.sqlite"
95575 pkg-static RET openat 5
95575 pkg-static CALL fstat(0x5,0xbfbfde88)
95575 pkg-static STRU struct stat {dev=973143811, ino=913839, mode=0100644,
nlink=1, uid=0, gid=0, rdev=0, atime=1488508880.607008273,
mtime=1488508814.956928131, ctime=1488508814.956928131, birthtime=-1,
size=64334848, blksize=4096, blocks=125672, flags=0x0 }
95575 pkg-static RET fstat 0
95575 pkg-static CALL pread(0x5,0xbfbfe460,0x64,0,0)
95575 pkg-static GIO fd 5 read 100 bytes
0x0000 5351 4c69 7465 2066 6f72 6d61 7420 3300 0400 0101 0040 2020 0007
6598 0000 f56b 0000 766c 0000 1565 0000 1e32 |SQLite format 3......@
..e....k..vl...e...2|
[bytes are skipped]
95575 pkg-static RET pread 100/0x64
[lock the database]
95575 pkg-static CALL
mkdirat(0x4,0x2881f76e,0777<S_IRUSR|S_IWUSR|S_IXUSR|S_IRGRP|S_IWGRP|S_IXGRP|S_IROTH|S_IWOTH|S_IXOTH>)
95575 pkg-static NAMI "local.sqlite.lock"
95575 pkg-static RET mkdirat 0
95575 pkg-static CALL fstatat(0x4,0x2884d21a,0xbfbfe058,0)
95575 pkg-static NAMI "local.sqlite-journal"
95575 pkg-static RET fstatat -1 errno 2 No such file or directory
95575 pkg-static CALL fstat(0x5,0xbfbfe058)
95575 pkg-static STRU struct stat {dev=973143811, ino=913839, mode=0100644,
nlink=1, uid=0, gid=0, rdev=0, atime=1488543655.118915624,
mtime=1488508814.956928131, ctime=1488508814.956928131, birthtime=-1,
size=64334848, blksize=4096, blocks=125672, flags=0x0 }
95575 pkg-static RET fstat 0
95575 pkg-static CALL fstatat(0x4,0x2884d235,0xbfbfe058,0)
95575 pkg-static NAMI "local.sqlite-wal"
95575 pkg-static RET fstatat -1 errno 2 No such file or directory
95575 pkg-static CALL fstat(0x5,0xbfbfe058)
95575 pkg-static STRU struct stat {dev=973143811, ino=913839, mode=0100644,
nlink=1, uid=0, gid=0, rdev=0, atime=1488543655.118915624,
mtime=1488508814.956928131, ctime=1488508814.956928131, birthtime=-1,
size=64334848, blksize=4096, blocks=125672, flags=0x0 }
95575 pkg-static RET fstat 0
95575 pkg-static CALL pread(0x5,0x2889ce28,0x400,0,0)
95575 pkg-static GIO fd 5 read 1024 bytes
0x0000 5351 4c69 7465 2066 6f72 6d61 7420 3300 0400 0101 0040 2020 0007
6598 0000 f56b 0000 766c 0000 1565 0000 1e32 |SQLite format 3......@
..e....k..vl...e...2|
[bytes are skipped]
95575 pkg-static RET pread 1024/0x400
[unlock the database]
95575 pkg-static CALL rmdir(0x2881f768)
95575 pkg-static NAMI "/root/local.sqlite.lock"
95575 pkg-static RET rmdir -1 errno 2 No such file or directory
95575 pkg-static CALL utimes(0x2881f768,0)
95575 pkg-static NAMI "/root/local.sqlite.lock"
95575 pkg-static RET utimes -1 errno 2 No such file or directory
95575 pkg-static CALL fstatat(0x4,0x2884d21a,0xbfbfdf08,0)
95575 pkg-static NAMI "local.sqlite-journal"
95575 pkg-static RET fstatat -1 errno 2 No such file or directory
95575 pkg-static CALL fstat(0x5,0xbfbfdf08)


Ralf
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2017-03-30 09:09:32 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

--- Comment #7 from Wolfgang Meyer <***@hob.de> ---
I am still seeing this bug with pkg 1.10.1.
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2017-05-03 11:28:44 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

--- Comment #8 from Markus Wennrich <***@gmail.com> ---
Just saw in the commit
https://github.com/freebsd/pkg/commit/6ed10aa80256b340b4e5c422438c435e7aa70c45
that you have to set "NFS_WITH_PROPER_LOCKING" for the fix to work.


***@server:/ # pkg info
pkg: sqlite error while executing PRAGMA user_version; in file
pkgdb.c:2442: database is locked

***@server:/ # echo "NFS_WITH_PROPER_LOCKING = true;" >>
/usr/local/etc/pkg.conf

***@server:/ # pkg info
autoconf-2.69_1 Automatically configure source code on many
Un*x platforms
(...)

==> I can confirm, that with "NFS_WITH_PROPER_LOCKING" set, pkg locking on NFS
works as expected.

But honestly, I don't see a reason for keeping the unix-dotfile code, since it
doesn't work with drop_privileges. As far as I can see, everbody who runs into
the unix-dotfile part has to set NFS_WITH_PROPER_LOCKING anyway. No?
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2018-02-07 15:09:14 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

***@utanet.at changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@utanet.at

--- Comment #9 from ***@utanet.at ---
Feedback from reporter please.
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2018-02-09 07:03:39 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

--- Comment #10 from Markus Wennrich <***@gmail.com> ---
"fixed" with echo "NFS_WITH_PROPER_LOCKING = true;" >> /usr/local/etc/pkg.conf
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2018-02-09 07:03:57 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

Markus Wennrich <***@gmail.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|New |Closed

--- Comment #11 from Markus Wennrich <***@gmail.com> ---
"fixed" with echo "NFS_WITH_PROPER_LOCKING = true;" >> /usr/local/etc/pkg.conf
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2018-02-12 19:53:02 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

Michael Dexter <***@callfortesting.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@callfortesting.org

--- Comment #12 from Michael Dexter <***@callfortesting.org> ---
I am seeing what might be a related issue with pkg 1.10.3 with an NFS booted
system.

Scenario:

Populate the client's /var/cache/pkg with a known-good set of packages,
including pkg itself.

pkg-static add -f /var/cache/pkg/pkg-1.10.3_1.txz

pkg-static: sqlite error while executing PRAGMA user_version; in file
pkgdb.c:2446: disk I/O error

I have NFS_WITH_PROPER_LOCKING = "true"; in /usr/local/etc/pkg.conf

I have lockd running on the NFS server.

Adding a tmpfs mount to /var/db/pkg allows it to work exactly as expected.

Any suggestions?
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2018-11-22 22:06:45 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213611

Oleksandr Tymoshenko <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@FreeBSD.org

--- Comment #13 from Oleksandr Tymoshenko <***@FreeBSD.org> ---
(this is high on Google search results page, so updating with the lates state
of the problem)

NFS_WITH_PROPER_LOCKING was removed:
https://github.com/freebsd/pkg/commit/655ddffae6ae4adc2820736a33736a99dea8dfc3

I was able to fix NFS locking problem by enabling lockd on *both* server and
client:

sysrc rpc_lockd_enable=YES
service lockd start
--
You are receiving this mail because:
You are the assignee for the bug.
Loading...