Evergreen ILS Website

IRC log for #evergreen, 2017-09-22


All times shown according to the server's local time.

Time Nick Message
00:33 Jillianne joined #evergreen
01:04 jeff_ joined #evergreen
01:35 Stompro joined #evergreen
02:13 Jillianne joined #evergreen
03:34 RBecker joined #evergreen
04:14 sandbergja joined #evergreen
04:24 abowling joined #evergreen
05:00 pinesol_green News from qatests: Test Success <http://testing.evergreen-ils.org/~live>
05:55 book` joined #evergreen
07:13 rjackson_isl joined #evergreen
07:17 Dyrcona joined #evergreen
07:59 csharp gmcharlt: reporting back that master as of yesterday morning installs and runs fine on xenial (aside from the grunt --all errors I shared yesterday)
08:43 Stompro joined #evergreen
08:44 _adb joined #evergreen
08:48 collum joined #evergreen
08:55 mmorgan joined #evergreen
09:03 bshum csharp: I think that's "normal", or I think so anyways.
09:04 bshum Though having them show as "ERROR" does look bad, heh
09:04 Bmagic This command in the upgrade to 3.0beta: UPDATE biblio.record_entry SET vis_attr_vector = biblio.calculate_bib_visibility_attribute_set(id);
09:05 Bmagic Is still running this morning (15 hours)
09:06 Bmagic Should it be looking at deleted bibs? When I do this on production, can I skip that command and run it after the rest of the upgrade script is finished?
09:06 Dyrcona Bmagic: It depends, and yes.
09:07 Dyrcona However, search may be "off" until it finishes.
09:07 Bmagic What if I craft a script to run that command on a single bib at a time
09:08 Dyrcona It won't be any faster.
09:08 Bmagic search would slowly be "on" and we can start using the system while it runs?
09:09 Bmagic each bib it finishes will be included in search results? Sort of like an ingest?
09:09 Dyrcona Well, I mean search results might seem wrong until all records have had the visibility updated.
09:09 Dyrcona Sort of. I shouldn't say too much since I've not looked in depth at what that does.
09:10 Bmagic ok
09:11 Dyrcona As for touching deleted bibs, that depends on what you want staff to see.
09:11 Bmagic It looks like the whole rest of the beta script is creating functions
09:11 csharp Bmagic: http://git.evergreen-ils.org/?p=evergreen/pines.git;a=commit;h=99d5c1164f7180117760d48245db54a5a7bc0296 and http://git.evergreen-ils.org/?p=evergreen/pines.git;a=commit;h=826d5ba27d47123982a9fc06b44847cca125c24e helped me
09:11 Dyrcona Yeah. I often move things around in my own custom upgrade scripts.
09:12 csharp arguably, both of those shoulda been default :-)
09:12 Bmagic csharp: you notice an improvement in speed?
09:12 csharp Bmagic: oh yeah - upgrade (minus reingest) took just over an hour
09:12 Dyrcona He should. :)
09:12 csharp JBoyer deserves all the karma though ;-)
09:12 csharp JBoyer++
09:12 Bmagic I'm worried about the upgrade time, and that we can predict how long so we don't ask people to close longer than they have to, or worse, expect it to be done sooner and it's not
09:13 csharp Bmagic: exactly - after those changes, I'm comfortable predicting an overnight/1.5 day downtime
09:13 Dyrcona Bmagic: You have a database server where you can test?
09:13 Bmagic yep
09:13 Dyrcona Well, there you go!
09:13 Bmagic I'm on similar hardware for test
09:14 Bmagic now I am debating killing it and starting it over.....
09:14 csharp we do our upgrades over holiday weekends (Monday holidays), so I just say down on Saturday evening, back Tuesday morning
09:14 Bmagic it's in a transaction right now, so that should reverse everything back to before the start of the script
09:14 csharp Bmagic: correct
09:14 Dyrcona We're upgrading to 2.12 on Columbus Day weekend.
09:14 Bmagic JBoyer++
09:14 csharp Bmagic: worth stopping it, dropping triggers/exclude deleted, then re running
09:15 csharp probably faster than waiting, even at this point
09:15 Bmagic yeah, and we can run the same command for deleted later
09:15 Dyrcona Sometimes reloading the db from a dump and doing the script over with triggers disabled is faster than waiting. :)
09:15 csharp yarp
09:16 Bmagic you have convinced me. CTRL+C happened
09:17 gmcharlt a wild CtrlC appears! </pokemon>
09:17 gmcharlt (also, worst Pokemon if unexpected)
09:21 Bmagic lol
09:27 Bmagic does a reingest double as a vis_attr_vector = assignment?
09:31 yboston joined #evergreen
09:38 dbwells Bmagic: I don't think so.  I believe the record_entry visibility trigger only cares about changes in deleted-ness.
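Bmagic's idea at 09:07 of running the visibility update in pieces, combined with csharp's suggestion to leave deleted bibs for later, could look roughly like the sketch below. It only generates SQL text. The ID-range batching, the batch size, and the helper name `batched_visibility_updates` are illustrative assumptions, and the `NOT deleted` filter assumes the stock `biblio.record_entry` schema. As Dyrcona notes, this is not expected to be faster overall; the only gain is that each batch can commit on its own.

```python
from typing import Iterator


def batched_visibility_updates(min_id: int, max_id: int,
                               batch_size: int = 10000) -> Iterator[str]:
    """Yield one UPDATE per ID range, skipping deleted bibs.

    Hypothetical helper: the statement is the one quoted at 09:04, with an
    added ID-range and deleted filter (both are assumptions of this sketch).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        yield (
            "UPDATE biblio.record_entry "
            "SET vis_attr_vector = "
            "biblio.calculate_bib_visibility_attribute_set(id) "
            f"WHERE id BETWEEN {start} AND {end} AND NOT deleted;"
        )
        start = end + 1


if __name__ == "__main__":
    # Print the statements so they can be reviewed before being fed to psql.
    for statement in batched_visibility_updates(1, 25000):
        print(statement)
```

Running the same generator later without the `NOT deleted` clause would cover the deleted bibs, as Bmagic suggests at 09:15.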
09:52 csharp PINES A/T update: working without errors since re-enabling parallel processing
09:53 csharp still haven't ferreted out the acq a/t issue, but installed the stock PO JEDI template and waiting for our acq person to create test POs
09:53 mmorgan csharp++
09:54 roycroft joined #evergreen
09:56 berick csharp: interesting, re: parallel.  thanks for updating.
10:03 gmcharlt draft of a list of all individual contributors to the web staff client: https://gitlab.com/snippets/1676297
10:03 gmcharlt this is derived from Git + adding a couple folks I know whose contribution isn't directly reflected in the git log
10:04 gmcharlt this is just a starting point, so additions welcome, either via comments on the snippet or just letting me know
10:04 gmcharlt and to emphasize, this is for contributions to the web staff client since its inception, not just contributions in 3.
10:04 gmcharlt *3.0
10:18 csharp <homer>woohoo! I made the list!</homer>
10:20 csharp okay, well, the stock PO JEDI template and stock environment still results in "error" :-(
10:20 berick that's good news
10:21 * csharp weeps at that good news
10:22 berick csharp: remind me what this error is?
10:22 csharp Can't call method "class_name" on an undefined value at /usr/local/share/perl/5.18.2/OpenILS/Application/Trigger/Event.pm line 594.
10:22 csharp basically, there's some bit of data that's not there but the reactor expects to be there
10:23 csharp https://pastebin.com/raw/CHpP91fn shows the problem (added debugging to output $context) during processing
10:24 csharp (in Event.pm just before the erroring line)
10:24 berick csharp: could you add some stuff to that log line?
10:24 csharp sure
10:25 berick add step=$step and path=@$path
10:26 Bmagic csharp: Did I see a bug from you about default schema assignments, My DB doesn't have public.maintain_control_numbers - because it's in schema evergreen
10:28 csharp Bmagic: yeppers: https://launchpad.net/bugs/1714026
10:28 pinesol_green Launchpad bug 1714026 in Evergreen "Maintain Control Numbers function should be schema-qualified" [High,Confirmed]
10:28 Bmagic which schema is it supposed to be in? evergreen?
10:29 bshum Depends on how your search_path is set, I suppose, with unqualified function name.
10:30 jvwoolf joined #evergreen
10:30 berick csharp: here's another thought.  the PO you're testing, can you confirm all acq.lineitem_detail.eg_copy_id values refer to copies that actually exist?
10:30 berick the logs stop right where one might expect a copy to be
10:35 csharp berick: https://pastebin.com/raw/hfkVLTN2
10:35 csharp berick: will check
10:36 berick ok, in both cases, it's trying to reference an asset.copy
10:36 csharp Bmagic: my patch explicitly sets it to "evergreen", but that's kinda arbitrary as bshum suggests
10:36 Bmagic I see
10:39 csharp berick: I can confirm that the copy exists
10:39 csharp (single item PO)
10:40 berick csharp: ok, more logging...
10:41 csharp berick++ # helpin'
10:41 berick csharp: i assume you are logging at INFO level?
10:41 csharp berick: yes, because debug was creating 1.5G files ;-)
10:42 berick can you change the $logger->debug line (around 619) to $logger->info()
10:42 berick and..
10:42 mmorgan csharp: Anything odd about the copy's location?
10:43 csharp mmorgan: nope - looks normal to me - anything you're thinking about?
10:43 * mmorgan is always suspicious about things like punctuation, etc.
10:44 csharp mmorgan: yeah - the location name is "ON ORDER" and we've had weird punctuation problems in acq, but only at the actual EDI translation level, never with A/T before
10:45 berick csharp: ok.. another question.  before the first null context and after the previous context= log line (that looks OK) do you see a cstore retrieval call for a copy?
10:46 miker (I'm not here, but...) re csharp's trigger disabling patches from earlier, that's another place where setting the relocation role would be in order, if you're into that sort of thing ;)
10:47 berick csharp: in other words, right before this line, there should be a cstore call: 2017-09-22 10:31:08 utility03 open-ils.trigger: [INFO:26620:Event.pm:594:] $context = , $step = call_number, @$path =
10:47 miker er, replication
10:49 rlefaive joined #evergreen
10:49 * Dyrcona wonders if setting the replication role would mess with configured replication.... Will check documentation later.
10:50 gmcharlt yeah, it won't break native Pg replication
10:50 gmcharlt but will do a number on Slony
10:52 csharp berick: https://pastebin.com/raw/2i4rnuEi
10:53 csharp other A/T stuff fired off right as I did that, so hopefully there's not noise :-/
10:53 collum joined #evergreen
10:54 berick hm, yeah, a lot of that is unrelated.  no worries.
10:54 csharp is there an advantage of setting the role over dropping/re-enabling triggers (aside from not having to monkey around in the upgrade script)?
10:54 csharp berick: I can give you a lot more context - hundreds and hundreds of lines of it ;-)
10:55 berick csharp: i'd look if you shared
10:55 * csharp wishes he'd gotten it done before circ notices kicked off
10:55 gmcharlt csharp: less typing, for one
10:56 gmcharlt in other contexts, it can also be used as a way to help manage certain kinds of batch updates without having to take the system down
10:57 gmcharlt in particular, by avoiding potential contention from the AccessExclusiveLocks needed to run an ALTER TABLE
10:58 csharp berick: http://evergreen-ils.org/~csharp/logstuff.gz
10:58 csharp about 1100 lines
10:58 csharp gmcharlt: thanks
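The replication-role approach miker and gmcharlt describe can be sketched as follows. This is a minimal illustration, not csharp's patch: it wraps arbitrary statements in a transaction that sets PostgreSQL's `session_replication_role` to `replica`, so ordinary triggers do not fire for that transaction and no `ALTER TABLE` is needed. The helper name is hypothetical. Note that the setting normally requires superuser rights, that it also bypasses foreign key checks (which PostgreSQL implements as triggers), and that, per gmcharlt, it is a poor fit for Slony-replicated databases.

```python
from typing import Iterable, List


def with_triggers_suppressed(statements: Iterable[str]) -> List[str]:
    """Wrap statements so ordinary triggers do not fire while they run.

    SET LOCAL limits the change to the enclosing transaction, so the
    setting reverts automatically at COMMIT or ROLLBACK.
    """
    script = ["BEGIN;", "SET LOCAL session_replication_role = replica;"]
    script.extend(statements)
    script.append("COMMIT;")
    return script


if __name__ == "__main__":
    # Placeholder statement; substitute the real upgrade SQL.
    print("\n".join(with_triggers_suppressed(["SELECT 1;"])))
```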
10:58 berick csharp++
11:00 csharp berick: and I have another test PO ready to run with more logging, but I'll wait for you to review that
11:00 berick csharp: please run
11:00 csharp k
11:01 gmcharlt "Leeeeeroy Jeennnnn... oh wait, not that kind of running"
11:01 berick run, csharp, run!
11:01 csharp arg - can't reload the perl file while the other A/T is running
11:02 berick well, with southern accent more like "c-shawarp"
11:02 berick hah
11:03 Dyrcona "Chree-us!"
11:03 berick csharp: while you wait, do all of the lineitem_details have unique eg_copy_ids?  there's no overlap?
11:04 csharp in the cases I'm testing today, these are single lineitems with one copy
11:04 berick ah, ok
11:04 berick that makes it cleaner
11:05 csharp I can look at older ones that failed though (didn't have the logging in place then at that point)
11:07 csharp berick: no overlap on previous orders
11:07 berick csharp: can you confirm IDL class "acqlid", <link field="eg_copy_id" has a reltype of "has_a" ?
11:08 rfrasur joined #evergreen
11:08 csharp hmmm no
11:08 csharp <link field="eg_copy_id" reltype="might_have" key="id" map="" class="acp"/>
11:08 csharp well, damn
11:08 bshum Dun, dun, DUN
11:09 csharp I wonder if I changed that for reports
11:09 * csharp hops in the batmobile and runs off to check pines git repo
11:11 csharp bug 1702489
11:11 pinesol_green Launchpad bug 1702489 in Evergreen "Wrong Join Type In Acq Lineitem Detail Causes Inaccurate Reports" [Low,New] https://launchpad.net/bugs/1702489 - Assigned to Chris Sharp (chrissharp123)
11:12 csharp another dichotomy between reports and production use
11:12 berick csharp: you up for testing with the original reltype to confirm it's an issue?
11:12 csharp berick: absolutely
11:12 berick i'm just pointing out stuff that looks unusual
11:13 csharp I'll need to let these circ notices finish
11:13 csharp "I'mma let them finish"
11:14 berick "but hold notices are the best notices"
11:14 csharp kind of between test servers right now
11:14 csharp heh
11:14 csharp I suspect this is the reason for it though
11:15 csharp "return rows even if null"
11:18 csharp right - this makes sense - I changed it in the reports version of fm_IDL.xml, but not the main one before
11:19 csharp I think we have enough divergences between usage to consider leveraging the 2 separate files into 2 separate uses
11:19 csharp not sure if that was the original intent
11:19 berick it was not
11:19 berick one is just a web/locale-friendly version
11:19 * csharp figured it wasn't ;-0
11:19 csharp ah, ok
11:20 berick assuming that's the issue, A/T could be made smarter
11:20 csharp yeah, thought of that too
11:20 csharp I've got Tiffany creating me a new PO on a test server I've reverted that change on
11:22 csharp Dyrcona thought of fieldmapper right off
11:22 csharp since it was syntactically correct, I moved right past it
11:25 rlefaive joined #evergreen
11:32 csharp berick++ # works
11:32 berick *phew*
11:32 csharp seriously
11:33 csharp I've been sweating bullets
11:33 csharp wow - ok
11:47 jihpringle joined #evergreen
11:55 csharp okay - confirmed working in production
11:55 * csharp exhales
11:56 mmorgan csharp++
11:56 mmorgan berick++
12:08 csharp okay, so turning my attention to the catalogers' network error issues...
12:08 csharp twice I've seen reports of network errors from open-ils.search open-ils.search.biblio.record.mods_slim.retrieve.authoritative
12:09 csharp but when I run that from srfsh it comes right back and osrfsys logs aren't showing anything weird from those searches
12:10 csharp (for those watching at home, request open-ils.search open-ils.search.biblio.record.mods_slim.retrieve.authoritative "2317087" where "231087" is the bib ID
12:10 mmorgan csharp: I have seen that also.
12:10 csharp 2317087 that is
12:10 csharp mmorgan: good to have confirmation
12:11 berick you don't have the logs from the original errors, I take it?
12:11 csharp well, I have a report from this morning and I see three searches that correspond with that bib
12:12 mmorgan I see entries like this in the gateway logs:
12:12 mmorgan [INFO:23766:osrf_app_session.c:394:15060886412376613] Returning NULL from app_request_recv after timeout: open-ils.search.biblio.record.mods_slim.retrieve.authoritative ["4047886"]
12:14 berick mmorgan: could you post the entire osrfsys log, grepping on 15060886412376613 ?
12:14 csharp https://pastebin.com/FG69B5bR - the rest look the same
12:15 csharp huh - I don't see the timeout message
12:15 csharp let me look again
12:15 csharp yeah, that's not showing up for me
12:16 berick check open-ils.search.stderr_log too
12:16 berick or whatever it's called
12:16 berick on the brick in question
12:16 pastebot "mmorgan" at 64.57.241.14 pasted "log" (6 lines) at http://paste.evergreen-ils.org/849
12:18 csharp hmm still not there
12:18 csharp yeah, mmorgan's paste matches my logs
12:18 csharp maybe gateway logs?
12:19 csharp bingo
12:19 mmorgan Yup, gateway logs.
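berick's request at 12:14 to grep the logs for a single thread trace can be sketched in a few lines. The sketch assumes the bracketed header layout visible in the excerpts above, `[LEVEL:pid:file:line:threadtrace]`; the helper name `lines_for_trace` is hypothetical, and other log configurations may format the header differently.

```python
import re
from typing import Iterable, List

# Bracketed header as seen in the log excerpts above:
# [LEVEL:pid:file:line:threadtrace] (the trace may be empty)
HEADER = re.compile(r"\[(\w+):(\d+):([^:\]]+):(\d+):(\d*)\]")


def lines_for_trace(lines: Iterable[str], trace: str) -> List[str]:
    """Return the lines whose header carries the given thread trace."""
    matches = []
    for line in lines:
        header = HEADER.search(line)
        if header and header.group(5) == trace:
            matches.append(line)
    return matches


if __name__ == "__main__":
    import sys
    # Usage: python trace_filter.py <threadtrace> < logfile
    for hit in lines_for_trace(sys.stdin, sys.argv[1]):
        print(hit, end="")
```

Because thread traces can collide, matching lines may still include unrelated requests, so the result is a starting point rather than a clean transcript of one call.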
12:20 berick not much to go on there.  seems like it's dying within the mods parsing.
12:20 berick quietly
12:20 berick anyone see anything in open-ils.search_stderr.log ?
12:20 mmorgan but not all the time.
12:20 berick sometimes errors sneak through and only go to stderr log
12:22 csharp berick: mine's full of messages where we cancel long-running bib searches - I'll dig though
12:22 berick yeah, it can be a pain
12:22 berick lots of 'grep -v <normal stuff>' involved
12:26 mmorgan Hmm. Don't see *stderr.log, would it be in osrferror.log?
12:26 berick mmorgan: no, it's a different file
12:26 berick it's at /openils/var/log/open-ils.search_stderr.log by default
12:27 berick on the brick/server that's running the open-ils.search service (not a shared syslog server)
12:27 mmorgan Ah. Ok, will take a look there.
12:28 csharp hmm - we should really fix those so they go somewhere central
12:29 * csharp tosses that on humongous pile of Evergreen TODOs
12:29 berick we tried.. there's an opensrf patch to redirect all stderr data to the chosen log file.
12:29 berick stuff still leaks through, though
12:31 * mmorgan can't find anything with the threadtrace or record number in stderr.log on that brick.
12:31 berick mmorgan: it may not be that easy :(
12:31 berick sometimes you just get exceptions with little corresponding data
12:32 csharp no dice
12:32 berick if it's happening regularly, one thing I do is: date >> open-ils.search_stderr.log
12:33 berick then wait for it to happen again.  then I have a starting point
12:33 berick csharp: hm, ok
12:33 mmorgan So just look for something that happened at the same time.
12:33 berick mmorgan: yeah, as much as possible
12:34 csharp mmorgan: that was my approach - interestingly, the timestamps output by postgres are apparently UTC
12:34 berick hm, I was wrong before about it dying in the search service...  open-ils.search never gets the record
12:34 csharp so mathing was involved
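The "mathing" csharp mentions, lining up UTC timestamps from PostgreSQL with local-time application logs, can be done with a small helper. This is a sketch under assumptions: the timestamp layout matches the `YYYY-MM-DD HH:MM:SS` form seen in the logs above, and the offset of -4 hours in the test value (US Eastern daylight time) is a guess, since the log does not state the server's time zone.

```python
from datetime import datetime, timedelta, timezone


def utc_to_local(stamp: str, utc_offset_hours: int) -> str:
    """Convert a 'YYYY-MM-DD HH:MM:SS' UTC stamp to a fixed-offset local time."""
    parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    as_utc = parsed.replace(tzinfo=timezone.utc)
    local = as_utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    # Assumed offset of -4 hours; adjust for the actual server time zone.
    print(utc_to_local("2017-09-22 14:31:08", -4))
```

A fixed offset keeps the example dependency-free; it does not handle daylight saving transitions.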
12:34 berick search -> cstore query -> cstore returns -> <all stop>
12:35 berick oh wait, but search does get the result in csharp's logs
12:35 berick or at least logs the api call duration
12:36 csharp berick: that may not have been one of the timeouts
12:36 csharp this is an intermittent problem, at least as reported to me
12:36 berick csharp: oh
12:36 berick ok
12:36 berick in mmorgan's log, search never receives the bib record from cstore
12:36 csharp let me post one that *is* an issue
12:38 berick since we're talking about xml data, i can't help but wonder if the latest osrf patches are incomplete -- they were directly addressing issues with sending xml (and similar) data via jabber
12:39 Dyrcona Yay for infrastructure changes at the next to last minute!
12:39 * Dyrcona was a part of that, so.... :)
12:40 * berick too for better or worse
12:40 csharp https://pastebin.com/83v5HXZA
12:41 rlefaive joined #evergreen
12:41 berick heh, that's annoying.  log trace collision.  same result, though.  open-ils.search never gets the result from cstore
12:42 csharp http://gapines.org/eg/opac/record/5876106 is the record involved - nothing special about it
12:42 berick the result being a biblio.record_entry
12:42 berick all plump with xml
12:43 gmcharlt but tiny amounts of XML, in that case
12:43 csharp oh, I didn't notice the actor.usr_message stuff :-/
12:43 berick gmcharlt: indeed
12:44 * mmorgan has often seen irrelevant stuff like the actor.usr_message entries here when staring at logs for this issue.
12:45 berick csharp: no reports of this happening for patron catalog views?
12:45 berick ditto mmorgan
12:46 berick well, hmm, may be different.. it uses the unapi.* stuff -- different API call.
12:46 berick but still lots of xml there too
12:48 Dyrcona I was getting crashes with staff search in both the xul client and web staff client but not the patron opac at one point before one of the patches went in.
12:48 gmcharlt berick: thinking aloud - wonder if problem is bundling, not chunking
12:48 Dyrcona By crash I mean, sometimes an Internal Server Error, and sometimes a blank list of search result.
12:48 gmcharlt i.e., that particular record is notable by being fairly small
12:48 mmorgan We've certainly seen internal server errors and lp 1704396 in the public opac, but this definitely seems more prevalent on the staff side, bib related rather than circ related.
12:49 pinesol_green Launchpad bug 1704396 in Evergreen "Slowness for metarecord and one-hit searches in 2.12" [High,New] https://launchpad.net/bugs/1704396
12:49 berick gmcharlt: yeah...
12:50 gmcharlt (side issue: we really ought to come up with better thread trace generation real soon now)
12:50 * Dyrcona wonders if the issue is again not counting bytes correctly....
12:50 Dyrcona Perl--
12:51 berick gmcharlt: log thread trace?  yeah, that bugs me too.
12:51 mmorgan So is thread trace collision only a problem when trying to parse through logs?
12:51 berick mmorgan: yeah, log trace has no impact on the application
12:52 mmorgan ok.
12:54 * mmorgan is way down in the foothills of the learning curve, but this caught my eye: http://git.evergreen-ils.org/?p=OpenSRF.git;a=blob;f=src/perl/lib/OpenSRF/AppSession.pm#l1197
13:02 csharp berick: none that I've heard
13:02 berick *nod*
13:06 csharp berick: in fact, as far as I'm getting the information (filtered through helpdesk tickets, etc.), *only* catalogers are complaining
13:06 berick csharp: i assume patron and staff all share bricks, yeah?
13:06 berick it's not segmented
13:07 * mmorgan is also hearing this from tech services staff
13:07 berick worth noting the response sizes from both mmorgan and csharp logs are a fraction of the max-bundle-size (and a smaller fraction of the max-chunk-size).
13:09 csharp berick: yes, same bricks everywhere
13:09 mmorgan same bricks here, too.
13:09 berick even w/ the jabber xml boilerplate, they should be way too small to be impacted by that.
13:09 berick mmorgan: csharp: thanks
13:10 berick or rather, they are not near any size boundaries.  could be other issues i'm not thinking of.
13:11 miker (still not here...) this feels similar to the stalled search issue that mmorgan knows about. don't have lp number handy
13:12 csharp bug 1704396, right?
13:12 pinesol_green Launchpad bug 1704396 in Evergreen "Slowness for metarecord and one-hit searches in 2.12" [High,New] https://launchpad.net/bugs/1704396
13:12 miker that's it
13:12 * csharp agrees that it feels similar
13:13 * mmorgan does also
13:14 berick yeah.. responses just never arriving
13:20 miker to the tcpdump-mobile! (j/k)
13:28 kmlussier joined #evergreen
13:30 csharp well, since this is our highest priority issue now that acq ordering is back, I'm happy to keep providing log data, experimenting with patches, etc.
13:30 csharp for the moment, I have to go back to installing OpenSRF 3.0.0-alpha/EG 3.0-beta2 on our test cluster
13:52 csharp is there a tags/rel_3_0_beta2 on the way?
13:53 dbwells csharp: there is now :)
13:53 csharp dbwells++
13:55 dbwells csharp: sorry about that, it's there now.  There is somewhat less to it, though, given that the upgrade script is hand-committed and the translation pushes are being done separately.  Still good for tracking, though.
13:56 acautley joined #evergreen
13:59 csharp dbwells++ # thank you!
14:02 * dbwells wonders if undeserved karma is bad for... his karma
14:05 kmlussier gmcharlt++ #List of web client contributors.
14:50 jvwoolf left #evergreen
15:07 Bmagic anyone seen this before
15:07 Bmagic Bad arg length for Socket::pack_sockaddr_in, length is 0, should be 4 at /usr/lib/x86_64-linux-gnu/perl/5.22/Socket.pm
15:09 Bmagic during autogen
15:10 Bmagic nevermind, it was memcached
15:10 Bmagic rubberduck
15:12 berick @praise rubberduck
15:12 * pinesol_green rubberduck goes to eleven
15:12 berick and how
16:58 acautley joined #evergreen
17:00 pinesol_green News from qatests: Test Success <http://testing.evergreen-ils.org/~live>
17:05 mmorgan left #evergreen
17:50 Jillianne joined #evergreen
18:38 acautley joined #evergreen
21:04 dbs heads up for the nginx caching crew with 2.12, the offline session management interface must be suffering from some caching issue as when I create a new offline session, it doesn't show up in the client I used to create the session but does show up in another client
21:04 dbs then when I upload transactions, those transactions don't show up in that client but will show up in a third client... etc
21:10 dbs guessing that it's a cache setting for XUL_OFFLINE_MANAGE_XACTS_CGI
21:11 dbs so /cgi-bin/offline/offline.pl
21:14 dbs yep, expires header is set for now + 1 year. okay, that can be fixed :)
21:19 dbs ExpiresByType text/plain "now"
21:19 dbs should do the trick
21:24 dbs Better is to put ExpiresDefault "now" into the /openils/var/cgi-bin/offline Directory, though
