Evergreen ILS Website

IRC log for #evergreen, 2016-02-16


All times shown according to the server's local time.

Time Nick Message
06:40 rlefaive joined #evergreen
07:44 mrpeters joined #evergreen
07:44 JBoyer joined #evergreen
08:02 * csharp wishes he had a dime for every time he says "God, I f*-ing hate reports" under his breath
08:24 tsbere csharp: Why are you wishing small? Wish for $100 for every time. :P
08:28 jeff if i had a nickel for every time someone had intentionally run an Evergreen reporter report in the last year, I'd have... hrm. $0.05.
08:28 collum joined #evergreen
08:29 * tsbere comments on bug 1486592 but doesn't examine it much more closely
08:29 pinesol_green Launchpad bug 1486592 in Evergreen "Copies in concerto data should have prices" [Wishlist,Confirmed] https://launchpad.net/bugs/1486592
08:31 Dyrcona joined #evergreen
08:36 jeff based on the longer-than-usual delay between "job started" and "job's done" emails hitting my inbox, i think our statewide resource sharing catalog bibs are loading properly again.
08:42 Dyrcona jeff: That's good. I usually have those jobs spit out what is going on, including any errors.
08:42 Dyrcona Some jobs I only have it send me the errors.
08:43 rjackson_isl joined #evergreen
08:45 jeff yeah, this is on a system we don't control. we just deliver the files, and at the usual time each day i get a series of three emails for each file.
08:45 jeff begin, success (with a detailed log), end
08:45 jeff or sometimes: begin, no files to load, end
08:47 jeff and every so often, begin, failure, end.
08:47 jeff good to have checks for things like "are we sending them files" and "are they loading files"
08:47 jeff but what i was not checking for was "is the file we are sending them identical to the last file we sent them?"
08:48 mmorgan joined #evergreen
08:55 rlefaive joined #evergreen
09:00 Dyrcona I've got one that checks for zero byte files, 'cause we may not have added new records on a weekend.
09:01 Dyrcona But, I don't worry if the files are identical, because they shouldn't be. That would typically mean going a month without us adding or deleting records and that's inconceivable. :)
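The two sanity checks discussed above (zero-byte files, and jeff's "identical to the last file we sent" case) can be sketched in a short shell script. File names and paths here are hypothetical, not from the log:

```shell
#!/bin/sh
# Sketch of pre-delivery sanity checks on an export file.
# Demo data lives in a temp dir so the script is self-contained.
set -e
dir=$(mktemp -d)
printf 'record data\n' > "$dir/updates.mrc"       # today's export
printf 'record data\n' > "$dir/updates.prev.mrc"  # yesterday's export
: > "$dir/empty.mrc"                              # a zero-byte export

# check FILE [PREVIOUS]: warn if FILE is empty, or identical to PREVIOUS.
check() {
    f=$1; prev=$2
    [ -s "$f" ] || echo "WARN: $f is empty (zero bytes)"
    if [ -n "$prev" ] && [ -f "$prev" ] && cmp -s "$f" "$prev"; then
        echo "WARN: $f is identical to the previous export"
    fi
}

check "$dir/updates.mrc" "$dir/updates.prev.mrc"
check "$dir/empty.mrc"
```

A real job would run `check` before transmitting the file and keep a rolling copy of the last file sent for the comparison.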
09:03 jeff yeah, in this case it was a sequence conflict on a state table that's used to generate incrementals.
09:03 jeff meaning no new entries in the state table, so for the past several mornings the update file had been identical.
09:04 jeff so, the better thing to check going forward will be "are there new entries in the state table?" :-)
09:04 Dyrcona Sounds like it, but I don't know exactly what your process is.
09:05 jeff Dyrcona: it's one of those processes that i wouldn't know exactly what the process is if i didn't have it written down.
09:05 Dyrcona :)
09:06 Dyrcona If it has to do with ILL, I thought I knew the process, but now I'm not so sure.
09:06 jeff shifting gears a bit, for those on Apache 2.4 did you find that you need to further limit the number of requests each Apache child processes before being recycled?
09:06 Dyrcona Let me check.
09:07 jeff as in, use a higher value for MaxConnectionsPerChild / MaxRequestsPerChild than before?
09:07 Dyrcona I think we had increased things dramatically, but when we split so public and staff were on different bricks, we had to let Apache drop to defaults on public to avoid crashes.
09:07 jeff default is 0, meaning no limit to "the amount of memory that process can consume by (accidental) memory leakage" :-)
09:09 jeff on Debian Jessie with Apache 2.4 and recent master, I'm able to run a 4 GB machine out of memory rather quickly. I'm upping the RAM for the VM, but adjusting MaxConnectionsPerChild does seem to quite effectively resolve the symptoms.
09:10 Dyrcona jeff: Yeah, on public I recently set MaxConnectionsPerChild to 1000, MaxSpareServers 20, MaxRequestWorkers 120, the rest are at defaults.
09:11 Dyrcona By recently, I mean the file was last changed on Dec. 1.
09:11 * jeff nods
09:11 jeff thanks for looking!
09:11 Dyrcona IIRC, it took a few weeks of tinkering.
09:11 Dyrcona I'll check our staff brick, too. Its configuration is different.
09:11 jeff do you recall if "hey, we're running out of memory" was the primary motivation behind the change?
09:12 Dyrcona Yes, it was.
09:12 Dyrcona Sometimes it was running out of memory. Other times, the box looked mostly OK, but load was stratospheric.
09:13 Dyrcona This VM has 24GB of RAM configured, too. :)
09:13 Dyrcona Presently, it is using 6.6GB, or 3.0GB +/- buffers and cache.
09:14 Dyrcona It also has 24GB of swap, and yes we were seeing all of it consumed.
09:15 Dyrcona We could probably drop the RAM down to 16GB or maybe as low as 8GB at this point.
09:15 jwoodard joined #evergreen
09:16 Dyrcona The Apache vm on the staff brick has the same memory configuration, but is using 12GB (11GB +/- buffers and cache).
09:16 Dyrcona They're both using about 10MB of swap, btw.
09:17 jeff which mostly means that something pressured things to swap at some time probably long ago. :-)
09:17 Dyrcona Yes, I don't worry about swap usage much. It doesn't seem to affect performance until you hit about 75%.
09:18 Dyrcona It's probably just some data that was needed at startup and *might* be needed later.
09:18 Dyrcona So, for the staff brick things are a bit different.
09:19 Dyrcona We use StartServers 100, MinSpareServers 25, MaxSpareServers 150, MaxRequestWorkers 500, and MaxConnectionsPerChild 1000.
09:20 Dyrcona As I recall, the default for MaxRequestWorkers is 256, and I was seeing us starting to run out of RAM on the public side with around 150 Apache processes going.
09:20 Dyrcona Thus, I chose 120 for public.
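For reference, the public-brick tuning Dyrcona describes would look roughly like this in an Apache 2.4 mpm_prefork config. Only MaxSpareServers, MaxRequestWorkers, and MaxConnectionsPerChild come from the log; the other values are illustrative defaults:

```apache
# Sketch of the public-brick prefork settings from this discussion.
<IfModule mpm_prefork_module>
    StartServers              5     # default (illustrative)
    MinSpareServers           5     # default (illustrative)
    MaxSpareServers          20     # from the log
    MaxRequestWorkers       120     # from the log; stock default is 256
    # Recycle children periodically so slow memory leaks can't exhaust
    # RAM; the stock default of 0 means "never recycle".
    MaxConnectionsPerChild 1000     # from the log
</IfModule>
```

The key point from the discussion is MaxConnectionsPerChild: with the default of 0, a child that leaks memory lives forever, which is how a 4 GB box can be run out of memory under sustained load.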
09:21 Dyrcona We've found that the use patterns for the OPAC and the staff client are very different.
09:21 Dyrcona So, splitting the bricks and configuring Apache differently for each has really helped.
09:22 Dyrcona The other thing is, we had a lot of idle Apache processes on the public side with our original settings.
09:22 Dyrcona We have fewer idlers now.
09:24 jeff since in my experience "bricks" often gets applied to different things, can you clarify your usage? :-)
09:26 tsbere jeff: We have two 3 vm "bricks", one for public and one for staff.
09:26 tsbere (we also have single utility and sip vms)
09:27 tsbere No load balancing or anything
09:28 jeff with no load balancing, what components run on each of the three VMs in a given brick?
09:28 jeff is public apache all on one of the three VMs in the public brick?
09:29 jeff or am i failing to understand your answer?
09:29 tsbere For both 3 vm bricks Apache is one vm, ejabberd/router the second (settings as well), all other drones on the third
09:29 jeff okay, got it.
09:31 Dyrcona Yeah, not really what people think of as the traditional brick setup.
09:32 RoganH joined #evergreen
09:32 jeff 'swhy i asked. :-)
09:32 Dyrcona It seems to work for us that way, and certainly better than when we ran it all on 1 machine.
09:32 * jeff nods
09:33 jeff how many physical machines are you spread across now?
09:33 Dyrcona 'Course we later found out that the RAID on that 1 machine was switching between spares because one of the main drives had died.
09:33 Dyrcona Counting the database server, 3.
09:34 Dyrcona There's one physical machine for each "brick."
09:35 jeff and the sip and utility VMs are shoved in there somewhere also?
09:35 Dyrcona Yes. The utility runs on the staff brick hardware.
09:36 Dyrcona And sip runs on the public side.
09:36 * Dyrcona had to check to make sure.
09:36 jeff heh
09:36 jvwoolf joined #evergreen
09:36 Dyrcona tsbere is away from his desk. He would just know the answer.
09:37 Dyrcona Four vms on each physical machine.
09:37 tsbere Note that the sip/utility vms are standalone bricks themselves
09:37 tsbere Also, if you want to count NFS/Logging there are 4 machines
09:37 jeff heh
09:38 jeff nfs/logging on a fourth physical machine? did you stick memcached there also, or somewhere else?
09:38 tsbere I think I stuck memcached on the DB server, actually
09:38 Dyrcona Is memcached still running on the db server?
09:38 Dyrcona heh
09:38 Dyrcona I forgot that sip and utility run their own ejabberd. I remembered that they do run drones and listeners.
09:39 * tsbere wanted the "public" side to work even when the NFS box was down, so it has copies of all the configs instead of symlinks
09:39 tsbere That includes sip, the utility vm and the staff brick all use NFS-hosted files directly
09:39 Dyrcona The logging box would be an OK place for memcached, though.
09:39 jeff with a single staff VM for all staff apache processes, what kinds of things do you end up using NFS for?
09:40 jeff usual things like reporter output don't seem to apply, though maybe if you're running clark on the utility VM...
09:40 tsbere jeff: Well, for starters, reports and utility stuff end up moving from the utility VM to the staff apache VM
09:41 Dyrcona So, yeah, we do run clark on the utility vm.
09:41 tsbere jeff: Also, I think there is at least one part of the system that apache writes files but drones read them, thus those VMs need to talk to NFS
09:47 maryj joined #evergreen
09:51 jeff tsbere: ah, probably something in vandelay?
09:51 * jeff tries to think
09:52 tsbere jeff: Don't recall if it was vandelay or offline circ
09:52 tsbere That being a programmer's or, meaning it could be both. :P
09:53 jeff heh
10:01 yboston joined #evergreen
10:01 rlefaive_ joined #evergreen
10:20 Christineb joined #evergreen
10:33 Guest16800 left #evergreen
12:01 rlefaive joined #evergreen
12:02 bmills joined #evergreen
12:09 jihpringle joined #evergreen
12:49 pinesol_green [evergreen|Jason Stephenson] LP 1499123: Add release notes. - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=eabd816>
12:49 pinesol_green [evergreen|Jason Stephenson] LP 1499123: Modify Perl code for csp.ignore_proximity field. - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=831a808>
12:49 pinesol_green [evergreen|Jason Stephenson] LP 1499123: Add ignore_proximity to config.standing_penalty. - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=63205ed>
13:23 bmills1 joined #evergreen
13:32 bmills joined #evergreen
13:53 berick Dyrcona: FYI @ http://git.evergreen-ils.org/?p=working/random.git;a=shortlog;h=refs/heads/collab/berick/pingest -- adding some options to your record ingest script.  let me know if you ever put it on github or similar, i'll send a pull request.
13:55 Dyrcona berick: Cool! I'll have to give your changes a try some time, soon.
14:09 Stompro joined #evergreen
14:09 kmlussier @hate scheduling meetings
14:09 pinesol_green kmlussier: But kmlussier already hates scheduling meetings!
14:10 berick @reallyhate
14:10 pinesol_green berick: Try restarting apache.
14:10 kmlussier @loathe scheduling meetings
14:10 pinesol_green kmlussier: http://www.firstpersontetris.com/
14:15 Dyrcona git apply is not being my friend.
14:17 Dyrcona I downloaded bericks change above as a patch but git refuses to apply it.
14:17 Dyrcona At first it complains about a whitespace change.
14:17 Dyrcona After I fix that, it says nothing but the file is unpatched.
14:18 berick maybe a -p  or --directory param?
14:18 berick the path is different
14:18 Dyrcona Yeah, I did it in the directory where the file lives. I'll try -p.
14:21 Dyrcona -p 1 didn't help, but doing it from my root and specifying --directory=./perl did.
14:22 berick cool
14:26 Dyrcona Patch applied in a test branch. I'll have to test it later this week.
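The `--directory` trick above can be demonstrated end to end. `git apply` strips one leading path component by default (`-p1`, the `a/`/`b/` prefix) and `--directory` then prepends the given root, so a patch made against a top-level `pingest.pl` can be applied to a tree where the file lives under `perl/`. The file contents here are placeholders:

```shell
#!/bin/sh
# Demonstrate applying a patch whose paths differ from the local tree.
set -e
tmp=$(mktemp -d); cd "$tmp"

# Local tree keeps the script under perl/, but the patch was generated
# against a tree where it sat at the top level.
mkdir perl
printf 'one\n' > perl/pingest.pl

cat > change.patch <<'EOF'
--- a/pingest.pl
+++ b/pingest.pl
@@ -1 +1 @@
-one
+two
EOF

# -p1 (the default) strips "a/"; --directory re-roots under perl/.
git apply --directory=perl change.patch
```

After this, `perl/pingest.pl` contains the patched content even though the patch never mentions the `perl/` directory.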
14:27 jeff oh hey, i got this to happen again:
14:28 jeff OpenSRF Drone [open-ils.search]
14:28 jeff \_ OpenSRF Drone [open-ils.search]
14:28 jeff \_ OpenSRF Drone [open-ils.search]
14:32 jeff alas:
14:32 jeff Your search - "opensrf listener" "gets confused" "thinks it's a drone" - did not match any documents.
14:33 gmcharlt jeff: does it seem to actually be confused, or is it just that the process name is wrong?
14:33 jeff ejabberd sends it a message, it ignores it.
14:33 * Dyrcona wonders if it didn't die and the oldest child somehow got promoted, not sure if each becomes its own process group leader.
14:34 * Dyrcona thinks they do, so that shouldn't happen, but....
14:34 jeff pid and process start time indicate that it is the Listener process, but it has changed name and behavior (but not pid or process start time)
14:34 Dyrcona jeff: OK. That is strange. I've never seen that happen.
14:35 jeff unusual circumstances: i've been intentionally running this machine out of memory by hammering it with tpac search requests, and OOM killer has killed apache processes.
14:35 Dyrcona Right. I thought that might be part of it.
14:36 jeff debian jessie, recent master of both opensrf and evergreen.
14:39 Dyrcona Oh, beauty... git log -p reveals I have a byte order marker in a SQL query.
14:39 jeff i can't reliably reproduce outside of the "i've done it twice in the past two days"
14:39 Dyrcona Well, you're deliberately overstressing the system. I'd expect the listener to just die, though, and not start acting like a drone.
14:40 jeff i'm wondering if the behavior could result from a failed attempt to fork.
14:40 jeff that might be way off base, though.
14:41 Dyrcona Query still works, though.
14:42 Dyrcona I'm not sure what would happen in that case.
14:42 Dyrcona Typically, if fork fails, you exit your program.
14:43 jeff oh, that's exactly what's happening.
14:43 jeff $child->{pid} = fork();
14:43 jeff perl fork() returns undef if the fork failed.
14:43 jeff if($child->{pid}) { # parent process
14:43 jeff ...
14:43 Dyrcona Yes.
14:43 jeff } else { # child process
14:43 Dyrcona That looks like the culprit. Good catch!
14:44 jeff and in that else $child->{pid} gets set to $$ and we eval $child->init, which is where $0 gets set to OpenSRF Drone [$service]
14:46 jeff we should probably treat 0 and undef differently.
14:46 Dyrcona Yes, definitely.
14:47 Dyrcona It would probably be safe to log the failure to fork a new child and then do nothing if the result of fork is undef.
14:47 * jeff nods
14:48 jeff of course, that could lead to a situation where we chew CPU because there's no memory, so perhaps a sleep or some other backoff would be suitable also.
14:48 jeff but at that point, you're probably already in deep trouble.
14:50 Dyrcona Yeah, when you're out of resources and can't fork a process, it might be a good idea to hang it up and go home. :)
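The fix jeff and Dyrcona converge on can be sketched in Perl. This is a hedged illustration of the idea (distinguish `undef` from `0`, log, and back off), not the actual OpenSRF patch; variable names follow the snippet quoted above:

```perl
use strict;
use warnings;

my $child = {};
my $pid = fork();

if (!defined $pid) {
    # fork() failed -- likely out of memory or processes. Without this
    # branch, undef is treated as "false" and the listener falls into
    # the child path, renaming itself "OpenSRF Drone [...]".
    warn "fork failed: $!; backing off before retrying\n";
    sleep 5;    # crude backoff so we don't spin the CPU while starved
} elsif ($pid) {
    # Parent (the listener): record the new child's pid and carry on.
    $child->{pid} = $pid;
} else {
    # Child (a real drone): only here is it safe to set $pid to $$,
    # call $child->init, and let $0 become "OpenSRF Drone [$service]".
    $child->{pid} = $$;
}
```

As jeff notes, by the time fork() fails the box is already in deep trouble, so the backoff mainly keeps the listener from making things worse.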
14:55 kmlussier Calling 0951
15:01 * Dyrcona decided to fire off a pingest.pl test on his dev vm anyway.
15:02 krvmga joined #evergreen
15:03 pinesol_green [evergreen|Kathy Lussier] LP 1499123: Stamping upgrade script for standing-penalty-ignore-proximity - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=c1b64bf>
15:06 mmorgan1 joined #evergreen
16:01 jlitrell joined #evergreen
16:05 Dyrcona berick++ I added and pushed your patch for pingest.pl.
16:05 pinesol_green [evergreen|Chris Sharp] LP#1486592 - Generate prices for concerto dataset. - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=05e9a08>
16:06 berick Dyrcona++
16:07 mmorgan joined #evergreen
16:08 vlewis joined #evergreen
16:09 rlefaive joined #evergreen
16:23 JBoyer gmcharlt++ # Helping Clark overcome amnesia
16:50 vlewis_ joined #evergreen
16:53 vlewis joined #evergreen
16:57 vlewis_ joined #evergreen
17:07 ddale joined #evergreen
17:07 mmorgan left #evergreen
17:07 kmlussier gmcharlt: I was just looking at bug 1067823 again. Do you know why MARC tag 659 was added to the definition for the genre field? I would have thought we would be using the mods definition there, and I don't see any documentation that points to 659 being a genre field.
17:07 pinesol_green Launchpad bug 1067823 in Evergreen "tpac: genre links in record details page launch subject search" [Medium,Confirmed] https://launchpad.net/bugs/1067823 - Assigned to Kathy Lussier (klussier)
17:10 gmcharlt kmlussier: as near as I can tell, it's a legacy of the NLM using 659 for genre back in the days when dinosaurs roamed the earth
17:11 gmcharlt and my including it in that patch was basically just cargo-culting from Open-ILS/src/templates/opac/parts/record/subjects.tt2
17:11 gmcharlt all of that said: I agree it's non-standard, and I certainly have no objection to just sticking with 655
17:11 gmcharlt @coffee me
17:11 * pinesol_green brews and pours a cup of Bonsai Blend Espresso, and sends it sliding down the bar to me
17:14 kmlussier gmcharlt: OK, I'll play with that branch a bit.
17:14 kmlussier @coffee gmcharlt
17:14 * pinesol_green brews and pours a cup of Ethiopia Yirgacheffe, and sends it sliding down the bar to gmcharlt
17:14 kmlussier Good luck sleeping on that
17:14 * gmcharlt buzzes
17:40 kmlussier I actually see two records in a production system with a 659. But, in one case, I'm pretty sure it's a mistake since I've never heard of a genre called "shark attacks"
17:41 gmcharlt kmlussier: you must not watch the Syfy channel ;)
17:41 gmcharlt but yeah, 2 records is just noise
18:28 jihpringle_ joined #evergreen
18:30 dluch_ joined #evergreen
18:32 dbs_ joined #evergreen
18:32 berick_ joined #evergreen
18:37 jwoodard @decide haiku or no
18:37 pinesol_green jwoodard: go with haiku
18:44 jwoodard Warm wind blowing free, February moves onward, birds chirp happily.
18:44 Bmagic joined #evergreen
18:44 hopkinsju joined #evergreen
18:44 dluch joined #evergreen
18:45 _bott_ joined #evergreen
19:53 book` joined #evergreen
20:15 jlitrell Ahh, February / Snow, snow, snow, rain, rain, snow, rain / I can't feel my feet.
20:17 jeff heh
20:17 jeff jwoodard++ jlitrell++
20:27 bmills joined #evergreen
21:55 scrawler joined #evergreen
21:56 scrawler anybody here this evening?
21:57 scrawler see you later...
21:57 scrawler left #evergreen
21:57 jeff patience...
22:09 phil___ joined #evergreen
