• crash :(

    From apam@21:1/125 to ALL on Sat Feb 10 12:44:09 2018
    My BBS crashed today, and I have no idea why. Well it sort of crashed, it
    was still running, but something crashed because there was a linux
    backtrace in the screen.

    I can't for the life of me figure out what crashed or why. I think it may
    have been a webserver thread.

    Also someone logged in and attempted to download a file then just sat
    there for 4 hours until their time ran out. Which is suspicious. Either
    something went awry with the download and it sat there forever, or they
    sat there pressing the keyboard so as not to cause an inactive timeout.

    If I could reproduce it I could probably fix it...

    Oh well. Don't you hate when that happens.

    Andrew


    --- MagickaBBS v0.10alpha (Linux/x86_64)
    * Origin: Exotica BBS - telnet://exoticabbs.com:2023/ (21:1/125)
  • From NuSkooler@21:1/121 to apam on Fri Feb 9 19:47:54 2018
    I can't for the life of me figure out what crashed or why. I think it may have been a webserver thread.

    If there is a backtrace/stack, what's the issue with finding the bug?



    --- ENiGMA 1/2 v0.0.9-alpha (linux; x64; 8.9.4)
    * Origin: Xibalba -+- xibalba.l33t.codes:44510 (21:1/121)
  • From apam@21:1/125 to NuSkooler on Sat Feb 10 13:00:47 2018
    If there is a backtrace/stack, what's the issue with finding the bug?

    I couldn't make sense of it.

    In my mind a corrupted double-linked list would indicate something like
    I'm writing memory somewhere I shouldn't right? The fact that it's got
    gnutls in there suggests to me that it's caused by the webserver as I
    think libmicrohttpd is the only thing that uses that.

    Going down to the magicka addresses the first one is in the record_last10_callers which I can't see a problem with..

    Andrew

    *** Error in `./magicka': corrupted double-linked list:
    0x0000000001d01f60 ***
    ======= Backtrace: =========
    /lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f1d6fb2b7e5]
    /lib/x86_64-linux-gnu/libc.so.6(+0x80baf)[0x7f1d6fb34baf]
    /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f1d6fb3853c]
    /usr/lib/x86_64-linux-gnu/libtasn1.so.6(+0xb2ba)[0x7f1d6dc6b2ba]
    /usr/lib/x86_64-linux-gnu/libtasn1.so.6(asn1_delete_structure2+0x7a)[0x7f1d6dc6c4ba]
    /usr/lib/x86_64-linux-gnu/libgnutls.so.30(+0x44738)[0x7f1d6ef3f738]
    /lib64/ld-linux-x86-64.so.2(+0x10de7)[0x7f1d7111ade7]
    /lib/x86_64-linux-gnu/libc.so.6(+0x39ff8)[0x7f1d6faedff8]
    /lib/x86_64-linux-gnu/libc.so.6(+0x3a045)[0x7f1d6faee045]
    ./magicka[0x40d39b]
    ./magicka[0x40eaef]
    ./magicka[0x40ed41]
    ./magicka[0x413615]
    ./magicka[0x413eb9]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f1d6fad4830]
    ./magicka[0x405149]


    --- MagickaBBS v0.10alpha (Linux/x86_64)
    * Origin: Exotica BBS - telnet://exoticabbs.com:2023/ (21:1/125)
  • From tenser@21:1/112 to apam on Fri Feb 9 23:01:25 2018
    On 02/10/18, apam said the following...

    If there is a backtrace/stack, what's the issue with finding the bug?

    I couldn't make sense of it.

    In my mind a corrupted double-linked list would indicate something like I'm writing memory somewhere I shouldn't right? The fact that it's got gnutls in there suggests to me that it's caused by the webserver as I think libmicrohttpd is the only thing that uses that.

    It could be any number of things. The actual bug, given the stack trace
    you posted, is in libc; probably in the LIST_$FOO macros from queue(3)
    (see /usr/include/sys/queue.h: lifted from BSD but used pretty much everywhere). Maybe Ulrich did his own, though; glibc reinvents everything
    else so who knows. My guess would be some sort of corruption in the ASN.1 handling in GNU TLS, which is likely a web server issue; possibly
    certificate related.

    You can see where in your binary it's coming from by running,
    `objdump -CD /path/to/magicka` and looking for the addresses from the
    magicka binary. Alternately, `addr2line -e /path/to/your/bin 0xdeadbeef`
    will tell you what line of code those PCs correspond to.
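
    To make the addr2line step less tedious, here is a rough sketch of a
    helper that pulls the module path and the bracketed address out of a
    glibc-style backtrace line and formats the matching command. The names
    bt_frame, parse_bt_frame and format_addr2line are made up for this
    example. It assumes the bracketed address can be passed straight to
    addr2line, which should hold for the ./magicka frames in the trace
    above; for shared-library frames you would most likely need the offset
    inside the parentheses instead, and this sketch does not handle that.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One parsed frame from a glibc-style backtrace line such as
 *   ./magicka[0x40d39b]
 *   /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f1d6fb3853c]
 * (hypothetical type, for illustration only) */
struct bt_frame {
    char module[256];         /* path of the binary or shared library */
    unsigned long long addr;  /* absolute address from the [0x...] part */
};

/* Returns 0 on success, -1 if the line does not look like a frame. */
static int parse_bt_frame(const char *line, struct bt_frame *out)
{
    const char *lb = strrchr(line, '[');  /* start of "[0x...]" */
    if (lb == NULL)
        return -1;

    /* The module path ends at '(' when a symbol/offset is present,
     * otherwise at the '[' itself. */
    const char *end = strchr(line, '(');
    if (end == NULL || end > lb)
        end = lb;

    size_t len = (size_t)(end - line);
    if (len == 0 || len >= sizeof out->module)
        return -1;
    memcpy(out->module, line, len);
    out->module[len] = '\0';

    /* strtoull with base 16 accepts the leading "0x". */
    char *stop;
    out->addr = strtoull(lb + 1, &stop, 16);
    if (stop == lb + 1 || *stop != ']')
        return -1;
    return 0;
}

/* Writes "addr2line -e <module> 0x<addr>" into buf; returns snprintf's result. */
static int format_addr2line(const struct bt_frame *f, char *buf, size_t cap)
{
    return snprintf(buf, cap, "addr2line -e %s 0x%llx", f->module, f->addr);
}
```

    Feeding it `./magicka[0x40d39b]` gives a command line ending in
    `-e ./magicka 0x40d39b`. addr2line only reports file and line numbers
    if the binary was built with debug info (-g).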

    Going down to the magicka addresses the first one is in the record_last10_callers which I can't see a problem with..

    Sometimes the corruption can be a while back; suppose you blow up the
    stack, your program may fault on return and jump to some random location
    in libc (or wherever).

    I'd look for a buffer overwrite of a stack variable.
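
    For what it's worth, the usual shape of that kind of bug is an
    unbounded strcpy() or sprintf() into a fixed-size array. Below is a
    minimal sketch of a bounded copy. The caller_entry struct, the field
    size and set_field() are all invented for illustration and are not
    taken from the Magicka source.

```c
#include <stdio.h>
#include <string.h>

#define CALLER_FIELD_MAX 16  /* hypothetical field size */

/* Hypothetical record; adjacent fixed-size fields are exactly where an
 * unbounded strcpy() into one field tramples the next. */
struct caller_entry {
    char name[CALLER_FIELD_MAX];
    char location[CALLER_FIELD_MAX];
};

/* Copies src into a fixed-size field, truncating instead of overflowing.
 * Returns 1 if the whole string fit, 0 if it was truncated. */
static int set_field(char *dst, size_t cap, const char *src)
{
    /* snprintf never writes more than cap bytes and always NUL-terminates
     * when cap > 0; its return value is the length it wanted to write. */
    int n = snprintf(dst, cap, "%s", src);
    return n >= 0 && (size_t)n < cap;
}
```

    Calling set_field(e.name, sizeof e.name, input) with an over-long
    input leaves e.location untouched and returns 0, so the caller can
    decide whether truncation is acceptable.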

    --- Mystic BBS v1.12 A38 2018/01/01 (Windows/32)
    * Origin: ACiD Telnet HQ / blackflag.acid.org (21:1/112)
  • From tenser@21:1/112 to apam on Fri Feb 9 23:06:18 2018
    On 02/10/18, apam said the following...

    If there is a backtrace/stack, what's the issue with finding the bug?

    I couldn't make sense of it.

    Actually....


    *** Error in `./magicka': corrupted double-linked list:
    0x0000000001d01f60 ***

    That error is in malloc(). I'd definitely look for a buffer overflow
    or a double-free.
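
    On the double-free side, one common defensive habit is to clear a
    pointer as soon as it is freed, so a second free becomes free(NULL),
    which is defined to do nothing. A small sketch follows; the session
    struct and function names are hypothetical, not Magicka's.

```c
#include <stdlib.h>
#include <string.h>

/* Free a pointer and clear it, so a repeated call is a harmless
 * free(NULL) rather than a double-free. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

struct session {     /* hypothetical structure */
    char *username;  /* heap-allocated, owned by the session */
};

/* Replaces the stored username with a copy of name.
 * Returns 0 on success, -1 if allocation fails (old value is kept). */
static int session_set_user(struct session *s, const char *name)
{
    size_t len = strlen(name) + 1;  /* include the terminating NUL */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, len);
    FREE_AND_NULL(s->username);     /* release any previous value */
    s->username = copy;
    return 0;
}

/* Safe to call more than once on the same session. */
static void session_close(struct session *s)
{
    FREE_AND_NULL(s->username);
}
```

    This only helps when the struct holds the one and only copy of the
    pointer. If another variable still holds the old address, that copy
    dangles regardless, and that is the case ASAN is better at catching.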

    --- Mystic BBS v1.12 A38 2018/01/01 (Windows/32)
    * Origin: ACiD Telnet HQ / blackflag.acid.org (21:1/112)
  • From tenser@21:1/112 to tenser on Fri Feb 9 23:09:14 2018
    On 02/09/18, tenser said the following...

    On 02/10/18, apam said the following...

    If there is a backtrace/stack, what's the issue with finding the bug?

    I couldn't make sense of it.

    PS: Try building with and running under ASAN. That might point you in
    the right direction: https://github.com/google/sanitizers/wiki/AddressSanitizer
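
    Roughly, with GCC or Clang that means adding -fsanitize=address (plus
    -g so the report has line numbers) to both the compile and link steps.
    The function below is only a stand-in to show the kind of size
    arithmetic ASAN checks; make_greeting() is an invented example, not
    code from Magicka.

```c
#include <stdlib.h>
#include <string.h>

/* Build with AddressSanitizer enabled (GCC or Clang), for example:
 *   cc -g -fsanitize=address -fno-omit-frame-pointer -o demo demo.c
 */

/* Returns a heap copy of "Hello, <name>!" or NULL if allocation fails.
 * The caller frees the result. */
static char *make_greeting(const char *name)
{
    static const char prefix[] = "Hello, ";
    size_t plen = sizeof prefix - 1;  /* length without the NUL */
    size_t nlen = strlen(name);

    /* +1 for '!' and +1 for the terminating NUL. Forgetting either one
     * is the classic off-by-one heap overflow: without ASAN it may only
     * show up much later inside malloc()/free(); with ASAN the run stops
     * at the bad write with a heap-buffer-overflow report. */
    char *out = malloc(plen + nlen + 2);
    if (out == NULL)
        return NULL;
    memcpy(out, prefix, plen);
    memcpy(out + plen, name, nlen);
    out[plen + nlen] = '!';
    out[plen + nlen + 1] = '\0';
    return out;
}
```

    An ASAN build runs noticeably slower and uses more memory, so it is
    probably best kept to a test instance rather than the live board.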

    --- Mystic BBS v1.12 A38 2018/01/01 (Windows/32)
    * Origin: ACiD Telnet HQ / blackflag.acid.org (21:1/112)
  • From apam@21:1/125 to tenser on Sat Feb 10 15:07:27 2018
    the right direction: https://github.com/google/sanitizers/wiki/AddressSanitizer

    Thanks a lot, that was a big help. I found a couple of errors with this.

    Still not 100% sure those errors were the cause of the crash, though they potentially could be the cause of other crashes, so glad to have them
    fixed.

    Andrew

    --- MagickaBBS v0.10alpha (Linux/x86_64)
    * Origin: Exotica BBS - telnet://exoticabbs.com:2023/ (21:1/125)
  • From Avon@21:1/101 to apam on Sat Feb 10 20:41:37 2018
    On 02/10/18, apam pondered and said...

    My BBS crashed today, and I have no idea why. Well it sort of crashed, it was still running, but something crashed because there was a linux backtrace in the screen.

    Poop, sorry to hear this... if there's anything I can do to help, lemme know..

    Best, Paul


    `I'm not expendable, I'm not stupid, and I'm not going' - Kerr Avon, Blake's 7

    --- Mystic BBS v1.12 A37 2017/12/30 (Windows/32)
    * Origin: Agency BBS | Dunedin, New Zealand | agency.bbs.nz (21:1/101)
  • From tenser@21:1/112 to apam on Sat Feb 10 20:22:56 2018
    On 02/10/18, apam said the following...

    Thanks a lot, that was a big help. I found a couple of errors with this.

    Still not 100% sure those errors were the cause of the crash, though they potentially could be the cause of other crashes, so glad to have them fixed.

    Sure; happy to help out. There are other sanitizers that might help
    you find latent errors as well: msan, ubsan (undefined behavior) and
    tsan (threads) among them. Often times when you see a fault somewhere
    deep in libc, like in malloc(), it's due to something one of those
    might pick up.

    --- Mystic BBS v1.12 A38 2018/01/01 (Windows/32)
    * Origin: ACiD Telnet HQ / blackflag.acid.org (21:1/112)