
Re: (ITS#9037) observing crash in mdb_cursor_put()



hyc@symas.com wrote on 2019-06-17 05:25:
> grobins@pulsesecure.net  wrote:
>> I am seeing an LMDB crash; please find the stack trace below.
>>
>> #0  0x009e362e in mdb_cursor_put (mc=0xffd59bb8, key=0xffd59d14,
>> data=0xffd59d0c, flags=0) at mdb.c:6688
>> #1  0x009e48ec in mdb_put (txn=0xd74d3008, dbi=2, key=0xffd59d14,
>> data=0xffd59d0c, flags=0) at mdb.c:8771
>> …
>>
>> Has anybody seen a similar issue before?
> Doesn't sound familiar. But 0.9.18 is quite old. If you can reproduce this
> issue in 0.9.23 then we'll take a look. Test code to reproduce the problem
> would also be needed.
I'm seeing this in Firefox, which uses LMDB 0.9.23 (with minor changes).
Line 6688 in 0.9.18 corresponds to line 6938 in 0.9.23 (line 6937 in Firefox
since we landed ITS#9030), and that's where we see crashes.

I haven't reported it here yet because I haven't been able to confirm
that it's a bug in LMDB as opposed to my own code. In fact, I haven't
been able to reproduce it at all; I've only seen it in crash reports
submitted by Firefox installations (almost exclusively on Windows). So I
don't have test code to reproduce the problem.

Nevertheless, FWIW, here's the Firefox bug that tracks the issue: 
https://bugzilla.mozilla.org/show_bug.cgi?id=1538541. And here are its 
crash reports: 
https://crash-stats.mozilla.org/signature/?signature=mdb_cursor_put 
(only the last seven days of reports are shown by default, but this has
been happening since we started using LMDB in Firefox nightly builds a
couple of months ago).

I've examined some of the dumps, and mc->mc_top is 0 when the crash 
occurs, while mc->mc_pg[0] is a NULL pointer. So presumably the crash 
occurs because IS_LEAF2 tries to dereference mc->mc_pg[mc->mc_top].
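
To make the suspected failure concrete, here is a minimal sketch of that state. It uses simplified stand-in types, not LMDB's real definitions: every sketch_/SKETCH_ name, the flag value, and the stack depth are assumptions for illustration only. It contrasts an unguarded flag check on the top-of-stack page with a variant that tests for NULL first.

```c
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT LMDB's real definitions. */
#define SKETCH_P_LEAF2 0x20   /* assumed flag bit marking a LEAF2 page */
#define SKETCH_STACK   32     /* assumed cursor stack depth */

typedef struct sketch_page {
    unsigned short mp_flags;  /* page flags */
} sketch_page;

typedef struct sketch_cursor {
    sketch_page   *mc_pg[SKETCH_STACK]; /* stack of pages, root first */
    unsigned short mc_top;              /* index of the top page in mc_pg */
} sketch_cursor;

/* Unguarded check: reads the page flags directly, so passing a NULL
 * page is undefined behavior (in practice, usually a crash). */
#define SKETCH_IS_LEAF2(p) (((p)->mp_flags & SKETCH_P_LEAF2) != 0)

/* Guarded variant: returns -1 when the top page pointer is NULL,
 * otherwise 1 if the page has the LEAF2 bit set and 0 if not. */
static int sketch_top_is_leaf2(const sketch_cursor *mc)
{
    const sketch_page *top = mc->mc_pg[mc->mc_top];
    if (top == NULL)
        return -1;
    return SKETCH_IS_LEAF2(top) ? 1 : 0;
}
```

With a zero-initialized sketch_cursor (mc_top 0, mc_pg[0] NULL, matching the dumps), sketch_top_is_leaf2 returns -1, whereas applying SKETCH_IS_LEAF2 to mc_pg[0] directly would dereference NULL. This only illustrates the symptom; it says nothing about how the cursor got into that state.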

Further investigation shows that insert_data and insert_key are both 
MDB_NOTFOUND, and flags is 0, so it isn't MDB_CURRENT, nor does it 
contain MDB_APPEND. If I understand the code in mdb_cursor_put 
correctly, this means that mdb_cursor_set was called on line 6614.
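
The flag reasoning above can be written out as a toy model. This is a sketch of my reading only, not the code in mdb.c: the SKETCH_ constants are assumed to mirror the MDB_CURRENT and MDB_APPEND values from lmdb.h, the enum and function names are hypothetical, and other branches of mdb_cursor_put (for example an empty database) are deliberately left out.

```c
/* Toy model of the reasoning above; NOT the code in mdb.c. */
#define SKETCH_MDB_CURRENT 0x40u     /* assumed to match MDB_CURRENT */
#define SKETCH_MDB_APPEND  0x20000u  /* assumed to match MDB_APPEND */

typedef enum {
    SKETCH_PATH_CURRENT,     /* overwrite at the cursor's current position */
    SKETCH_PATH_APPEND,      /* position at the end of the database */
    SKETCH_PATH_CURSOR_SET   /* look the key up, as mdb_cursor_set would */
} sketch_path;

/* Picks the positioning path from the put flags, following the email:
 * flags "is" MDB_CURRENT (equality), or "contains" MDB_APPEND (bit test),
 * otherwise the key lookup path is taken. */
static sketch_path sketch_put_path(unsigned int flags)
{
    if (flags == SKETCH_MDB_CURRENT)
        return SKETCH_PATH_CURRENT;
    if (flags & SKETCH_MDB_APPEND)
        return SKETCH_PATH_APPEND;
    return SKETCH_PATH_CURSOR_SET;
}
```

Under these assumptions, sketch_put_path(0) yields SKETCH_PATH_CURSOR_SET, which is the case seen in the crash dumps.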

And mdb_cursor_set is in the stack of another crash I've been 
investigating in mdb_page_search_root 
(https://bugzilla.mozilla.org/show_bug.cgi?id=1550174, 
https://crash-stats.mozilla.org/signature/?signature=mdb_page_search_root), 
which happens on all of Firefox's primary platforms (Windows, macOS, Linux).

But I haven't been able to reproduce that one either, on any of those 
platforms, so I have no idea if they're related (nor if mdb_cursor_set 
is even implicated in this crash). And I still can't say that either is 
an LMDB bug.

-myk