Time |
Nick |
Message |
06:33 |
|
collum joined #evergreen |
07:07 |
|
kworstell-isl joined #evergreen |
07:08 |
|
redavis joined #evergreen |
08:08 |
|
cbrown joined #evergreen |
08:10 |
|
BDorsey joined #evergreen |
08:26 |
|
mantis joined #evergreen |
08:33 |
|
dguarrac joined #evergreen |
08:41 |
|
mmorgan joined #evergreen |
08:45 |
|
BDorsey joined #evergreen |
10:25 |
|
kworstell-isl joined #evergreen |
10:27 |
|
kworstell-isl joined #evergreen |
10:50 |
|
kmlussier joined #evergreen |
10:50 |
|
kworstell_isl joined #evergreen |
10:50 |
kmlussier |
Somebody with more Launchpad powers than mine may want to change bug 1810854 back to Confirmed. |
10:50 |
pinesol |
Launchpad bug 1810854 in Evergreen "Trying to Merge a User in Collections Fails Silently" [Medium,Fix released] https://launchpad.net/bugs/1810854 |
10:52 |
kmlussier |
Also, good morning #evergreen! |
10:52 |
redavis |
I'll get at it. Hold on. |
10:53 |
redavis |
Also, good morning |
10:54 |
redavis |
kmlussier, fixed |
10:54 |
Bmagic |
I think I might have found the smoking gun on this circulation problem: circulator: HASH(0x55eb1b789438) : circ due data / close date overlap found : due_date=2024-12-19T09:51:46-0600 start=2024-11-27T22:59:00-06:00, end=2 |
10:54 |
Bmagic |
025-12-01T23:00:59-06:00 |
11:01 |
kmlussier |
redavis++ |
11:02 |
kmlussier |
heh, I probably still have the credentials for that bugmaster account stored in my pw manager. I probably could've fixed it myself. |
11:04 |
redavis |
no worries :-). I don't like to get into it unless I have a specific reason, and then log out pretty immediately and then back in as myself. |
11:21 |
|
abowling joined #evergreen |
11:27 |
csharp_ |
@decide sudo make me a sandwich or run0 make me a sandwich |
11:27 |
pinesol |
csharp_: go with run0 make me a sandwich |
11:34 |
jeff |
Bmagic: that looks likely. are you finding the closed date now, where you didn't find it before, or is it still not present in the table and you're possibly dealing with cached in-process data? |
11:35 |
Bmagic |
I think I trusted the staff when they told me that the closed dates were sound |
11:35 |
Bmagic |
:) |
11:35 |
jeff |
you had also indicated that it was not happening with all circs -- did you find a reason for that also? |
11:36 |
|
Christineb joined #evergreen |
11:36 |
Bmagic |
yeah, the closed dates were fixed over time (but not completely). The way the closed dates look today, there are about 7 branches affected by the Thanksgiving holiday, closed for 368 days instead of 2 days |
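A minimal sketch of the sanity check described above: flag any closed-date range that runs far longer than a holiday closure plausibly should. The first start/end pair is the one from the circulator log line earlier in this log; the branch names, the second row, and the 14-day threshold are hypothetical, and real data would come from the closed-dates table rather than a literal list.

```python
from datetime import datetime, timedelta


def overlong_closures(closures, max_days=14):
    """Return (branch, whole_days) for closures longer than max_days.

    closures is an iterable of (branch, start_iso, end_iso) tuples,
    with ISO 8601 timestamps that include a UTC offset like -06:00.
    """
    limit = timedelta(days=max_days)
    flagged = []
    for branch, start, end in closures:
        span = datetime.fromisoformat(end) - datetime.fromisoformat(start)
        if span > limit:
            flagged.append((branch, span.days))
    return flagged


# Hypothetical branch names; the first date pair is from the log line.
closures = [
    ("BR1", "2024-11-27T22:59:00-06:00", "2025-12-01T23:00:59-06:00"),
    ("BR2", "2024-11-27T22:59:00-06:00", "2024-11-29T23:00:59-06:00"),
]

for branch, days in overlong_closures(closures):
    print(f"{branch}: closed for {days} days, probably a wrong year in the end date")
```

Any due date that lands inside a flagged range gets pushed past the end of the closure, which would explain why only some circulations were affected.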
11:37 |
Bmagic |
so simple |
11:38 |
mmorgan |
hindsight++ |
11:38 |
Bmagic |
Sorry to bother everyone about it! Yall are so great! |
11:39 |
Bmagic |
but this segfault issue..... |
11:45 |
Bmagic |
Last night at 23:08, 8 minutes after the print action trigger fired, the OpenSRF router had a segfault and died. The nice thing is that the system freezes at that point and I get to see all of the logs and current processes. Here's what I've compiled so far: https://pastebin.com/YVizyW7Q |
11:46 |
Bmagic |
I'm thinking I'm going to set up a new machine with ejabberd and let it run for a few days to see if I have the same issue. |
11:49 |
* berick |
looks |
11:59 |
berick |
Bmagic: mind restarting and running both routers in gdb and trying to reproduce? i can help w/ the command |
12:00 |
Bmagic |
sure! Wanna continue with the redis machine? |
12:00 |
Bmagic |
It's still powered on, you caught me just before I deleted it |
12:00 |
berick |
yes, this is almost certainly an error introduced in the redis C code |
12:02 |
Bmagic |
ok, I'm game |
12:03 |
berick |
oh right, we can attach to running processes... |
12:03 |
berick |
so just restart everything like normal |
12:03 |
Bmagic |
I've isolated the issue to this machine by trial and error. The "main" util server isn't having issues, now that I've divided the crons up. And this one cron seems to be the one that consistently causes the segfault. In other words: we can use this as a test bed. |
12:04 |
Bmagic |
ok, I'm leaving the machine alive, and restarting everything like normal |
12:06 |
berick |
once it's up, get the PIDs of the 2 router processes, open 2 terminals and run this for each pid: gdb /openils/bin/opensrf_router <pid> |
12:06 |
berick |
i /think/ that will do it |
12:06 |
berick |
you also have to enter 'continue' once gdb loads the router |
12:06 |
berick |
well, each router |
12:07 |
berick |
i'll try it here too.. |
12:07 |
Bmagic |
this file doesn't exist: /openils/bin/opensrf_router |
12:08 |
|
jihpringle joined #evergreen |
12:08 |
Bmagic |
I found it: /openils/bin/opensrf_router |
12:08 |
Bmagic |
must have been a typo or something |
12:09 |
berick |
you may have to sudo to run gdb |
12:09 |
berick |
ok seems to work here, and you do have to enter 'continue' once gdb loads |
12:09 |
berick |
once they're both loaded and continued, start trying to break things again |
12:10 |
Bmagic |
Could not attach to process. If your uid matches the uid of the target |
12:10 |
berick |
try sudo |
12:10 |
Bmagic |
maybe no-go in docker environment? |
12:10 |
Bmagic |
I am root |
12:10 |
berick |
huh |
12:10 |
berick |
dang |
12:11 |
Bmagic |
ptrace: Operation not permitted |
12:11 |
Bmagic |
cat /proc/sys/kernel/yama/ptrace_scope |
12:11 |
Bmagic |
1 |
12:12 |
Bmagic |
I tried it on non-docker, works fine |
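A small sketch for interpreting the ptrace_scope value shown above. The meanings are the ones documented for the Yama security module; the helper names are made up for illustration. Note that a value of 1 alone would not normally stop root, so the failure inside the container is more likely down to Docker withholding the CAP_SYS_PTRACE capability by default. That part is an inference from the symptoms, not something confirmed in this log.

```python
from pathlib import Path

# Meanings of /proc/sys/kernel/yama/ptrace_scope per the Yama documentation.
PTRACE_SCOPE_MEANINGS = {
    0: "classic: any process may attach to another running as the same uid",
    1: "restricted: attach only to descendant processes unless CAP_SYS_PTRACE is held",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: ptrace attach is disabled until reboot",
}


def describe_ptrace_scope(value):
    """Translate a ptrace_scope value (int or str) into a description."""
    return PTRACE_SCOPE_MEANINGS.get(int(value), "unknown value")


def read_ptrace_scope(path="/proc/sys/kernel/yama/ptrace_scope"):
    """Return the current setting, or None if Yama is not available."""
    p = Path(path)
    return int(p.read_text().strip()) if p.exists() else None


current = read_ptrace_scope()
if current is not None:
    print(current, "->", describe_ptrace_scope(current))
```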
12:13 |
Bmagic |
I can get this rig set up outside of the container and do the gdb, and then what? keep it running in a screen so we can get more output when it crashes next time? |
12:14 |
berick |
right, once the router crashes, you can type 'bt' or 'backtrace' and it will show the full error output with line numbers. |
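The attach steps above can be sketched as a helper that builds one gdb command per router PID. The function name and the example PIDs are hypothetical; the command shape (gdb, then the binary, then the PID) is the one given earlier in this log. After gdb attaches, 'continue' still has to be typed in each session, and 'bt' once the router crashes.

```python
import shlex


def gdb_attach_commands(pids, binary="/openils/bin/opensrf_router"):
    """Return one argv list per PID, in the form: gdb <binary> <pid>."""
    return [["gdb", binary, str(pid)] for pid in pids]


# Hypothetical PIDs; in practice take them from ps or pgrep output.
for argv in gdb_attach_commands([4101, 4102]):
    # Print a copy-and-paste command line for each terminal or screen window.
    print(shlex.join(argv))
```

Printing the commands rather than running them keeps each gdb session interactive in its own terminal, which is what the backtrace step needs.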
12:15 |
Bmagic |
I'm all over it, no problem. I'm afraid that it won't break on the new machine though :) we shall see |
12:15 |
berick |
yeah... |
12:15 |
berick |
Bmagic++ |
12:16 |
Bmagic |
berick++ |
12:22 |
berick |
Bmagic: in the meantime, something like this might work? i've used it, but it's easy to try: addr2line -e /openils/bin/opensrf_router 7a496e327000+20000 |
12:22 |
berick |
*I've never used it |
12:23 |
berick |
oops, that would be /openils/lib/libopensrf.so.2.2.0 |
12:23 |
berick |
instead of /openils/bin/opensrf_router |
12:24 |
Bmagic |
!! let me see |
12:24 |
Bmagic |
how do I discover what hex to append? |
12:25 |
berick |
i think it's the stuff in libopensrf.so.2.2.0[7a496e327000+20000] from your log lines, but I'm not 100% sure |
12:26 |
Bmagic |
addr2line /openils/lib/libopensrf.so.2.2.0 7a496e327000+20000 |
12:26 |
Bmagic |
addr2line: 'a.out': No such file |
12:26 |
Bmagic |
I've recycled the processes since the log was outputted, will it be the same? |
12:26 |
berick |
put -e before the /openils.. part |
12:27 |
Bmagic |
oops, right |
12:27 |
Bmagic |
addr2line -e /openils/lib/libopensrf.so.2.2.0 7a496e327000+20000 |
12:27 |
Bmagic |
??:0 |
12:27 |
berick |
yeah.. figured |
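A hedged sketch of why the attempt above printed ??:0. In a kernel segfault log line, the bracketed part after the library name is the mapping's base address and size, not a code address. What addr2line expects is an offset into the file, usually the faulting instruction pointer (the "ip" field) minus that base. The sample line below is hypothetical apart from the library name and bracketed values quoted in this log; the ip value is invented for illustration. Depending on the kernel version the base may refer to the executable segment rather than the start of the file, so the result may need further adjustment, and without debug symbols addr2line will still print ??:0.

```python
import re

SEGFAULT_RE = re.compile(
    r"ip ([0-9a-f]+) .* in (\S+?)\[([0-9a-f]+)\+([0-9a-f]+)\]"
)


def addr2line_offset(log_line):
    """Return (library_name, hex_offset) from a kernel segfault log line.

    The offset is the instruction pointer minus the mapping base.
    """
    match = SEGFAULT_RE.search(log_line)
    if match is None:
        raise ValueError("not a recognizable segfault line")
    ip, lib, base, size = match.groups()
    offset = int(ip, 16) - int(base, 16)
    if not 0 <= offset < int(size, 16):
        raise ValueError("instruction pointer is outside the mapping")
    return lib, hex(offset)


# Hypothetical log line: the ip, sp and pid values are made up.
sample = (
    "opensrf_router[1234]: segfault at 0 ip 00007a496e33a1c4 "
    "sp 00007ffd00000000 error 4 in libopensrf.so.2.2.0[7a496e327000+20000]"
)
lib, offset = addr2line_offset(sample)
print("addr2line", "-f", "-e", f"/openils/lib/{lib}", offset)
```

Because the offset is relative to the mapping, it should stay valid across process restarts as long as the same library build is installed.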
12:29 |
Bmagic |
oh, you know, maybe I can gdb the process from the VM above the container |
12:30 |
Bmagic |
if I feed it the exact same opensrf_router (it can't be the same one that's inside the container, but an identical copy of it) will that work? |
12:30 |
Bmagic |
it just needs a reference to the code that's running the program in memory so it can track the lines back? |
12:30 |
berick |
no gdb has to attach to the running processes |
12:31 |
Bmagic |
a docker container exposes its processes to the VM above. I can use its PID and run gdb on the VM |
12:31 |
berick |
oh, you mean the binary.. |
12:31 |
berick |
maybe? |
12:31 |
Bmagic |
I'll try |
12:32 |
Bmagic |
that would be nice, so I'm testing the same situation where I've had this segfault a few times |
12:34 |
berick |
Bmagic: also https://stackoverflow.com/questions/21395106/how-can-i-gdb-attach-to-a-process-running-in-a-docker-container |
12:35 |
berick |
hm, don't see how that's really different from what you already tried |
12:35 |
Bmagic |
lxc-attach is the magic |
12:35 |
Bmagic |
berick++ |
12:37 |
Bmagic |
lol lxc-attach: command not found |
12:39 |
berick |
Bmagic: oh, maybe attach with: docker exec --privileged -it <container> bash |
12:39 |
Bmagic |
ok, thanks, I was getting to that actually |
12:39 |
Bmagic |
yes, that seems to have done it |
12:40 |
Bmagic |
getting into the machine via docker with the --privileged switch, is making gdb happy now |
12:40 |
berick |
yay |
12:40 |
Bmagic |
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". |
12:40 |
Bmagic |
0x000071a1fe8277e2 in __GI___libc_read (fd=3, buf=0x7fffec4b6630, nbytes=16384) at ../sysdeps/unix/sysv/linux/read.c:26 |
12:40 |
Bmagic |
26 ../sysdeps/unix/sysv/linux/read.c: No such file or directory. |
12:40 |
Bmagic |
not sure if it's completely happy |
12:40 |
berick |
i think that's ok. just do 'continue' once it's stopped loading |
12:41 |
Bmagic |
(gdb) continue |
12:41 |
Bmagic |
Continuing. |
12:41 |
berick |
cool |
12:41 |
Bmagic |
it's chillin now |
12:41 |
berick |
so get both routers attached |
12:41 |
berick |
and start the attack |
12:41 |
Bmagic |
ok, well, I wasn't prepared for it to work. I need to back out and get some screens going |
12:41 |
berick |
heh |
12:41 |
|
Dyrcona joined #evergreen |
12:47 |
|
kworstell-isl joined #evergreen |
12:49 |
Dyrcona |
vm snapshots++ |
12:51 |
Bmagic |
berick: no segfault yet, Imma try resetting more triggers, going back 2 days worth |
12:53 |
berick |
cool. i have to disappear for about an hour |
13:02 |
Bmagic |
no segfault! 2 days worth of stuff just finished (~5k events). well, I guess I'll just leave it running in debug until next natural cron execution. It seems to happen at night (probably because the pressure is higher during those hours). |
13:05 |
|
jihpringle joined #evergreen |
13:09 |
Dyrcona |
Y'know what? It installs. That's good enough for me at this point. |
13:20 |
redavis |
Dyrcona++ |
13:39 |
mmorgan |
Dyrcona++ |
13:56 |
csharp_ |
thinking_you_have_a_vm_snapshot_when_you_don't-- |
13:57 |
csharp_ |
in better news, the spaghetti carbonara I made the fam for lunch was delicious |
13:58 |
csharp_ |
@band add Waiting For EJabberD |
13:58 |
pinesol |
csharp_: Band 'Waiting For EJabberD' added to list |
14:15 |
Dyrcona |
csharp_++ carbonara++ |
14:18 |
csharp_ |
@band add Require All Grandad |
14:18 |
pinesol |
csharp_: Band 'Require All Grandad' added to list |
14:19 |
csharp_ |
@band add Automatic for your PeePaw |
14:19 |
pinesol |
csharp_: Band 'Automatic for your PeePaw' added to list |
14:19 |
|
mantis joined #evergreen |
14:31 |
* Dyrcona |
got interrupted while forward porting db upgrades. |
14:43 |
|
jihpringle joined #evergreen |
14:45 |
csharp_ |
mantis++ Dyrcona++ mmorgan++ redavis++ |
14:45 |
mantis |
csharp++ |
14:48 |
pinesol |
News from commits: Forward port 3.14.0 to 3.14.1 upgrade DB script <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=c425b37e063838880784f78af9dab410c4b6a5d3> |
14:48 |
pinesol |
News from commits: Forward port 3.13.5 to 3.13.6 upgrade DB script <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=1ee01707fe00e33daf08d64dbf9dd66c22f1d0d6> |
14:48 |
pinesol |
News from commits: Forward port 3.12.8 to 3.12.9 upgrade DB script <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=5b7d3e49f2aba487878e8ea54c42ebfd5dca798b> |
14:49 |
mmorgan |
csharp_++ Dyrcona++ mantis++ redavis++ |
14:49 |
Bmagic |
csharp_++ Dyrcona++ mantis++ redavis++ |
14:49 |
Bmagic |
mmorgan++ |
14:49 |
redavis |
csharp_++ Dyrcona++ mantis++ mmorgan++ |
15:00 |
|
kmlussier1 joined #evergreen |
15:01 |
|
mmorgan1 joined #evergreen |
15:25 |
Dyrcona |
mmorgan++ mantis++ abneiman++ redavis++ csharp_++ |
15:29 |
mantis |
Dyrcona++ abneiman++ redavis++ mmorgan++ |
15:35 |
redavis |
Goodness gracious |
15:36 |
|
mantis left #evergreen |
15:38 |
mmorgan |
abneiman++ |
15:40 |
abneiman |
redavis++ Dyrcona++ mantis++ mmorgan++ csharp_++ |
15:40 |
abneiman |
karma party lol |
15:41 |
redavis |
abneiman++ #all day every day |
15:47 |
csharp_ |
anyone know the purpose of the libnspr4-dev dependency? |
15:48 |
csharp_ |
(https://firefox-source-docs.mozilla.org/nspr/index.html) |
15:48 |
Bmagic |
It combs the IT person's hair after getting frazzled from figuring out the rest of the stack |
15:48 |
csharp_ |
my fry-fro is all frizzy |
15:48 |
Bmagic |
lol |
15:49 |
Bmagic |
sure looks like xul related |
15:49 |
Dyrcona |
It might have been required for XulRunner, not sure. |
15:49 |
csharp_ |
that's my working theory |
15:49 |
Bmagic |
delete it on production. That'll shake out the "reason" via tickets |
15:50 |
Dyrcona |
We don't link to it anywhere, so it must have been for XulRunner. |
15:53 |
berick |
huh, maybe from when we used to execute JS on the server, pre-nodejs.. was it spidermonkey? |
15:53 |
berick |
just guessing |
15:53 |
Dyrcona |
Could be.. I remember spidermonkey, though it was yanked about the time that I really got started. |
15:54 |
csharp_ |
gonna throw caution to the wind and just leave it out of my "build on Rocky" branch |
15:54 |
csharp_ |
not seeing evidence of linkage anywhere here either |
15:57 |
|
kmlussier joined #evergreen |
15:57 |
|
mmorgan1 joined #evergreen |
15:57 |
Dyrcona |
It is probably something that we do not need any longer. |
15:58 |
* Dyrcona |
wonders if spidermonkey is still around... |
15:59 |
Dyrcona |
SpiderMonkey is Mozilla’s JavaScript and WebAssembly Engine, used in Firefox, Servo and various other projects. It is written in C++, Rust and JavaScript. You can embed it into C++ and Rust projects, and it can be run as a stand-alone shell. It can also be compiled to WASI |
16:00 |
Bmagic |
that's the name of the chrome extension that I used to facilitate management of JS code that I wanted to run on websites to automate things |
16:01 |
Dyrcona |
yeah, an extension called spidermonkey also rings a bell. |
16:12 |
abneiman |
bluecross-- |
16:12 |
Dyrcona |
Greasemonkey... That's the extension that I'm thinking of. |
16:12 |
abneiman |
bluecross-- |
16:12 |
abneiman |
bluecross-- |
16:12 |
abneiman |
guess what I'm doing right now |
16:12 |
Dyrcona |
signing your parents up for medicare coverage? |
16:13 |
abneiman |
trying to convince blue cross that I do not, in fact, want to give them a 4x payment |
16:13 |
abneiman |
after they borked my autopay this month |
16:13 |
Dyrcona |
Well, my next guess was being denied coverage.... |
16:13 |
abneiman |
I've already sat through a lecture from them on How Email Works |
16:14 |
redavis |
Oh man, I looked away from IRC for five minutes! |
16:15 |
Dyrcona |
@dessert abneiman |
16:15 |
* pinesol |
grabs some EG2 Cookies for abneiman |
16:15 |
redavis |
lol, noooooo. |
16:17 |
abneiman |
Dyrcona++ |
16:26 |
redavis |
Friends, I'm spent. "See" y'all tomorrow. |
16:39 |
|
kmlussier joined #evergreen |
17:05 |
|
mmorgan1 left #evergreen |
17:09 |
Bmagic |
how email works: well kids, a long time ago, before you were born, humans created fire. Not long after, they started using it to send messages across great distances. The first protocol was SFTP (simple fire transfer protocol). Which eventually became SMTP, once we started putting smoke into TCP. And that's about when you came along. Isn't that great!? |
17:12 |
|
kmlussier1 joined #evergreen |
17:12 |
Bmagic |
Warring tribes started putting fake signals into the system, and the early humans needed to detect these. So they created DKIM (detected kudies in message). |
17:13 |
abneiman |
Bmagic++ |
17:13 |
abneiman |
much better than what bluecross was going for |
17:14 |
Bmagic |
:) |