Evergreen ILS Website

IRC log for #evergreen, 2017-10-17

All times shown according to the server's local time.

Time Nick Message
01:31 Jillianne joined #evergreen
06:01 pinesol_green News from qatests: Test Success <http://testing.evergreen-ils.org/~live>
07:43 dwgreen joined #evergreen
08:05 kmlussier joined #evergreen
08:09 collum joined #evergreen
08:10 collum_ joined #evergreen
08:14 _bott_ joined #evergreen
08:28 Dyrcona joined #evergreen
08:54 mmorgan joined #evergreen
08:58 jvwoolf joined #evergreen
09:14 JBoyer joined #evergreen
09:18 jvwoolf1 joined #evergreen
09:21 yboston joined #evergreen
09:45 collum joined #evergreen
09:50 csharp does webby respect the session timeout YAOUSes?
09:52 mmorgan csharp: I don't believe I tested in webby explicitly, but see lp 1693035
09:52 pinesol_green Launchpad bug 1693035 in Evergreen "Logins not honoring all org unit timeout settings" [Medium,New] https://launchpad.net/bugs/1693035
09:55 csharp mmorgan: perfect - thanks!
10:01 kmlussier csharp: I asked that question in here a few weeks ago. Initially, I was seeing that webby wasn't timing out at all, but the next day, it started to time out based on the YAOUS.
10:02 kmlussier I didn't investigate further because other things came up.
10:03 Dyrcona Webby is open. Maybe someone messed with the timeouts?
10:04 kmlussier Yes, let me correct myself. Not in webby, but in the web client. I did the testing on my own VM, and it wasn't an issue with the settings changing.
10:08 csharp sorry, yes I was using "webby" to mean "the web client"
10:11 kmlussier csharp: Dyrcona was speculating that it could be a cache issue when I reported back my results here - http://irc.evergreen-ils.org/evergreen/2017-09-29#i_327518
10:12 kmlussier But a couple of days earlier when I first started testing, I believe berick said he noticed it doesn't always work for him.
10:13 Dyrcona If there are errors rendering a template, where would they show up in the logs, if at all?
10:14 berick i confirmed one scenario where it fails to log out.  if the server is no longer responding, as in the case w/ a temp VM I set up yesterday, it gets stuck in the middle of the log out dance.  that's an atypical example, but maybe there's something to be learned there.  worth looking for JS console errors around the time it should have logged out...
10:15 Dyrcona Oh, wait! I see the problem. Never mind.
10:15 berick and FWIW, you can enable timestamps in the chrome JS console (config icon -> show timestamps)
10:16 berick useful for linking logs to the passage of auth timeout timing
10:19 Christineb joined #evergreen
10:26 csharp kmlussier: berick: thanks for the info - good to know
10:26 csharp sounds like tracking this kind of thing is going to be more straightforward due to built-in browser dev tools
10:26 * berick bumps staff timeout down to 5 minutes, logs back in and watches
10:34 jeff keep in mind that if you go too high with your timeout values, you'll just plain break webby. i can't remember if there's a bug on that. i'll look and create one if not.
10:35 jeff (and i suppose confirm that it's still a bug first)
10:39 berick hm, is there a separate ticket specifically about webby not honoring the timeouts?
10:40 csharp jeff: a number of our libraries set the timeouts to insanely high values during the days of bug 1036318
10:40 pinesol_green Launchpad bug 1036318 in Evergreen 2.4 "OPAC timeout within the client" [Medium,Fix released] https://launchpad.net/bugs/1036318
10:40 mmorgan1 joined #evergreen
10:41 csharp berick: not that I've seen
10:44 csharp seeing lots of no_tz.open-ils.storage.actor.user.crazy_search: prepare_cached(SELECT evergreen.unaccent_and_squash(?)) statement handle DBIx::ContextualFetch::st=HASH(0xd55f878) still Active at /usr/local/share/perl/5.18.2/OpenILS/Application/Storage/Publisher/actor.pm line 627. in the osrfwarn log
10:49 kmlussier berick: I didn't file any mainly because, once I got the timeout to work a couple of days after adding the setting, I didn't continue testing it to determine if my initial problem was indeed a cache issue or if there was a larger problem.
10:51 berick kmlussier: gotcha.  (again, not sure if we need one, just curious)
10:51 * berick looks at the API issue
10:53 jeff csharp: and did those insanely high values then cause problems with webby?
10:54 csharp jeff: dunno yet - I'll report back if so
10:54 csharp relaying back and forth with Terran, who's out training at our libraries
11:04 jeff understood. :-)
11:16 sandbergja joined #evergreen
11:31 Bmagic berick: select * from reporter.simple_record where tcn_value = '2468087'  results in 4 rows. Perhaps too many rows is the issue?
11:32 berick Bmagic: do where id = <whatever>
11:32 Bmagic berick: that can't be right, because I tested one of the barcodes that showed the title in the email and it had 4 rows also
11:33 Bmagic berick: using the id column to match the record from asset.call_number where call_number is asset.copy.call_number, I still get 4 rows
11:34 berick ok, yeah, there should only ever be 1 reporter.simple_record entry per bib ID
11:34 Bmagic strange that the example where the title appeared in the email also has 4 rows
11:35 berick i expect other problems will dissolve when the data is fixed
11:36 Bmagic I see that it's a view of biblio.record_entry joined with metabib.metarecord_source_map and metabib.full_rec
11:37 berick beware the IDL uses reporter.materialized_simple_record
11:37 Bmagic is that what is used in the AT?
11:37 berick yeah
11:38 berick AT walks the IDL to flesh environment objects
11:38 Bmagic so, it's really not reporter.simple_record I should be looking at
11:38 berick right.  s/beware/oh I just remembered/
11:38 Bmagic haha
11:39 Bmagic ok, so it makes more sense that it's a materialized view. The solution might be to truncate the table and re-create?
11:40 Bmagic FUNCTION reporter.refresh_materialized_simple_record()
11:40 Bmagic it looks like that function call could take a few days
11:41 berick yeah, that would take a while.  do you have a lot of dupe recs?
11:41 Bmagic funny you should ask, I did a presentation 2 years ago showing how we have been addressing duplicate records
11:42 Bmagic and we have been deduping bre on a regular basis using this 2-step approach
11:43 berick well i mean multiple occurrences of the same bib ID in reporter.materialized_simple_record
11:44 Bmagic berick: once I started querying rmsr - there is only 1 row per bib (of the two that I tested where one is showing the title and the other is not)
11:45 berick ok, good.
11:45 berick that's at least as it should be
11:45 Bmagic if there is a row there, then the email should have the title?
11:45 berick doesn't explain your issue, but one variable solved for
11:46 Bmagic select id,count(*) from reporter.materialized_simple_record group by 1 having count(*) > 1  0 rows returned
11:47 berick and you confirmed the rmsr rows in question have values for the 'title' column?
11:47 Bmagic sure enough
11:48 berick k.  on to the next theory! (of which I have none at the moment)
11:49 Bmagic lol
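
The duplicate check Bmagic ran at 11:46 (group by id, keep groups with more than one row) can be tried against a throwaway table. The following is only a sketch: it uses an in-memory SQLite database instead of PostgreSQL, the two-column table layout is a simplified stand-in for reporter.materialized_simple_record, and the helper names find_duplicate_ids and build_demo_db are hypothetical.

```python
import sqlite3


def build_demo_db(rows):
    """Create an in-memory table shaped like a (much simplified) rmsr.

    The two-column layout is an assumption for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE materialized_simple_record (id INTEGER, title TEXT)")
    conn.executemany("INSERT INTO materialized_simple_record VALUES (?, ?)", rows)
    return conn


def find_duplicate_ids(conn, table="materialized_simple_record"):
    """Return (id, row_count) pairs for ids that appear more than once.

    The table name is interpolated into the SQL, so pass trusted names only.
    """
    query = (
        f"SELECT id, COUNT(*) FROM {table} "
        "GROUP BY id HAVING COUNT(*) > 1 ORDER BY id"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    clean = build_demo_db([(1, "Title A"), (2, "Title B")])
    print(find_duplicate_ids(clean))   # no duplicated ids

    dirty = build_demo_db([(1, "Title A"), (1, "Title A"), (2, "Title B")])
    print(find_duplicate_ids(dirty))   # id 1 appears twice
```

An empty result, as in the log, means every id appears exactly once in the table being checked.
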
11:49 Bmagic is there something wrong with this:  [% circ.target_copy.call_number.record.simple_record.title %]
11:50 Bmagic the environment setup on the AT is: "target_copy.call_number.record.simple_record" "usr" "circ_lib.billing_address"
11:51 Bmagic perhaps "circ." at the beginning is an issue? It's wrapped in a loop [% FOR circ IN target %]
11:55 berick at a glance that seems fine
11:56 berick you could try sidestepping the issue by using the title (etc.) fetching utility code available to A/T
11:56 Bmagic berick: I wonder if this has something to do with the AT crash that I was troubleshooting some time back. It was a situation where the logs made it look like that one AT failed to function due to a totally different AT having an issue in the template. I wonder if somehow.....
11:56 Bmagic however, the email was emailed and the event was success, so that isn't exactly the same
11:57 berick re: my last comment http://git.evergreen-ils.org/?p=Evergreen.git;a=blob;f=Open-ILS/src/sql/Pg/950.data.seed-values.sql;h=152f508105680b23abc4a21d5e91ec7957491aef;hb=HEAD#l9050
12:00 _adb joined #evergreen
12:00 bwicksall joined #evergreen
12:05 Bmagic I can try that (I assume you are saying that I could try using this:  [%- copy_details = helpers.get_copy_bib_basics(circ.target_copy.id) -%] and then getting the title via [% copy_details.title %]) ?
12:05 berick yeah
12:06 berick no guarantees, just something else to try if you're out of ideas
12:06 berick might offer more logging opportunities too
12:06 Bmagic I see it resolves to my $mvr = $U->record_to_mvr($copy->call_number->record); ?
12:06 Bmagic is that in 2.11?
12:08 berick helpers.get_copy_bib_basics ? yeah, been around a while
12:09 Bmagic I was just confirming that, ok, thanks! I will try that!
12:09 Bmagic berick++
12:14 mmorgan joined #evergreen
12:32 jihpringle joined #evergreen
12:39 khuckins joined #evergreen
12:43 kmlussier miker: Can you remind me how we got rid of double scrollbars for the catalog records page, advanced search page, etc. in the web client? I'm seeing them again in some systems where the catalog has been customized, and I'm trying to remember how to fix it.
12:45 miker kmlussier: there were a few different things.  first, we added code to the iframe directive (eframe.js, IIRC) that, for some pages, can detect how tall the page is and adjust the iframe height. for some, we just said, "make it 10000px tall", and for others, there's a JS shim that does it now
12:45 miker we used them variously where the different detection methods seemed to work
12:45 kmlussier miker: OK, thanks. That gives us a couple of places to look.
12:46 kmlussier miker++
12:46 miker for the first method, it depends on knowing an element by CSS selector
12:46 miker so if an id or name attr changed, that could make it stop working
12:47 miker but, I think it's all tied into the staff/services/eframe.js code
12:47 miker all three methods
12:48 kmlussier Great! I'll start in that file.
12:49 miker yeah, it's all in $scope.egEmbedFrameLoader
13:08 eady joined #evergreen
13:22 jvwoolf joined #evergreen
13:36 pinesol_green [evergreen|Cesar Velez] LP#1710731 - fix webstaff hold slip and other templates missing call number - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=dae6cbf>
13:44 pinesol_green [evergreen|Cesar Velez] LP#1689325 - require most modals have explicit 'exit' or 'cancel' action inside the modal - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=a769724>
13:44 pinesol_green [evergreen|Cesar Velez] LP#1689325 - correct typo repeated templateUrl modal - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=81015a3>
13:44 pinesol_green [evergreen|Cesar Velez] LP#1721145 - fix Patron Message tab grids missing persist-keys - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=3bd3968>
13:48 pinesol_green [evergreen|Cesar Velez] LP#1714056 - fix for webstaff patron registration not requiring DOB - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=41db0c4>
13:56 pinesol_green [evergreen|Cesar Velez] LP#1715423 - fix issues with the display of IDs in patron summary pane - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=791fa3a>
13:58 pinesol_green [evergreen|Cesar Velez] LP#1714060 -  fixes thinko when obeying patron.password.use_phone setting in patron regctl - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=6ad3acf>
14:06 pinesol_green [evergreen|Cesar Velez] LP#1714566 - enable hold notes set to print on slip to be shown - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=a5a4d91>
14:06 pinesol_green [evergreen|Cesar Velez] LP#1714566 - enable hold notes to display on dialogs - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=8c59459>
14:08 pinesol_green [evergreen|Cesar Velez] LP#1712686 - display completed barcode on copy grids not partial input - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=101c093>
14:26 pastebot "dbwells" at 64.57.241.14 pasted "storage timeout issues - message in stderr log" (1 line) at http://paste.evergreen-ils.org/859
14:26 dbwells Have folks encountered this error?  ^^
14:27 dbwells We're seeing it a fair bit in our storage stderr log.
14:27 dbwells Trying to figure out if it is meaningful.
14:31 dbwells Running on 2.12.
14:32 mmorgan dbwells: I am also seeing those in the storage stderr log, we're on 2.12.4.
14:39 * miker looks
14:41 dbwells Stepping back, here is the issue as it presents.  We have a gateway request that fails intermittently.
14:42 dbwells It looks as follows in the gateway log: open-ils.search open-ils.search.biblio.copy_counts.location.summary.retrieve "1707932", 1, 0
14:42 dbwells ...(exactly 60 seconds later)...
14:42 dbwells Returning NULL from app_request_recv after timeout: open-ils.search.biblio.copy_counts.location.summary.retrieve ["1707932",1,0]
14:43 dbwells It's a really simple function running a pretty fast query, AFAICT
14:43 jeff is that a particularly thorny record?
14:43 dbwells We see the underlying query in the postgres log (I think).  No errors in osrfsys.log
14:44 dbwells jeff: No, one copy.
14:44 miker so, per docs/development/intro_opensrf.adoc, and a grep of the code, that /only/ happens when opensrf deems a drone to have served its max requests
14:46 mmorgan dbwells: We also see numerous "Returning NULL..." messages in our gateway logs, not always the same request is failing.
14:47 miker mmorgan: that's a C app saying "I made a remote API request of another opensrf service and it timed out, so I'm giving up"
14:48 * mmorgan nods.
14:48 dbwells Yeah, we see it in a few other cases where things are probably actually timing out, like occasional complex searches.
14:48 miker mmorgan: I'm guessing it's auth you see that from?
14:49 miker dbwells: what's your storage max_requests setting in opensrf.xml?
14:49 dbwells I think the C-app in this case is just osrf_json_gw itself.
14:50 miker ah, ok, that makes sense
14:50 dbwells miker: storage max_requests is at 1000
14:50 pastebot "mmorgan" at 64.57.241.14 pasted "Returning NULL errors" (24 lines) at http://paste.evergreen-ils.org/860
14:50 * miker didn't check to see if the gateways use that code path
14:51 mmorgan That's a sampling from this afternoon.
14:51 miker mmorgan: did you limit it to search, or is it always that app?
14:51 miker (or almost always)
14:52 mmorgan I did not limit the search. Most always that app.
14:52 miker kk, thanks. that's a good data point
14:55 miker dbwells: there must be some statement handle used in a supporting role in a storage api method that doesn't release it (and, looks like it maybe caches it outside of the method context closure)
14:57 miker although it sorta looks like the "stalled search" thing, if you squint. if it's related, this may help flesh out the sequence of events ... so thanks for the data!
14:57 * miker is looking at that one
14:59 dbwells miker: yeah, I think it might be.  Though I am not convinced the stderr message is really related, it's just the only straw I found to grasp.
14:59 * mmorgan must be squinting all the time, thinks every timeout bug is related ;-)
14:59 miker :)
15:00 dbwells miker: basically, if we run open-ils.search open-ils.search.biblio.copy_counts.location.summary.retrieve enough, it eventually just doesn't return, no error message.
15:00 miker dbwells: do you usually get that stderr output around that point?
15:01 dbwells miker: I just stumbled on it today, so trying to recreate the error to see...  Will report back once it happens again.
15:01 miker dbwells++
15:02 * miker wishes the stderr handler added a timestamp of some sort... but it's the logger of last resort
15:02 dbwells I am curious if request "lucky number 1000" just doesn't return in some cases, as if the drone is killed before it is done or something...
15:02 mmorgan1 joined #evergreen
15:03 miker it shouldn't, as the network should be autoflushed, and the send() def happens before the check for max_requests... but it's a lead I'm chasing down
15:03 Dyrcona dbwells: Variation on the off-by-one error: Taking a count too soon and acting on it.
15:07 dbwells miker: I am able to recreate the "hang" with enough persistence using open-ils.search.biblio.copy_counts.location.summary.retrieve but it doesn't look like the log message is related, unfortunately.
15:07 miker :(
15:07 miker thanks for testing
15:10 miker dbwells: I was going to say, before, the stderr msg looks like a bug of omission, by not closing a helper statement handle that gets cached somewheres ... seems more likely now, after your test
15:11 miker so, likely not a failure-causing bug, just noise
15:11 dbwells yeah
15:16 csharp this call: CALL: open-ils.search open-ils.search.serial.record.bib.retrieve 5621115, 1, 1
15:17 csharp which is made while a record loads, results in a "severe query error" (Empty IN list)
15:18 csharp it comes down to this line: my $orgs = $U->get_org_descendants($ou, $ou_depth); in Open-ILS/src/perlmods/lib/OpenILS/Application/Search/Serial.pm
15:18 csharp if you run that with $ou = 1 and $ou_depth = 1, you get no results (empty IN list)
15:19 miker csharp: that's a nonsensical depth param, I believe. the depth for 1 would be 0 (the depth field on ou_type)
15:19 csharp right
15:19 csharp for whatever reason, it's passing 1
15:19 csharp 1, 0 works as expected
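
Why "1, 1" comes back empty while "1, 0" works can be illustrated with a toy org tree. This is a sketch under a stated assumption, not the Evergreen implementation: it assumes the lookup first finds the ancestor of the given org unit (or the unit itself) sitting at the requested depth, then returns that ancestor's subtree. The tree, the ids, and the function names below are all hypothetical.

```python
# Hypothetical org tree: id -> (parent_id, depth). Not real Evergreen data.
ORG_TREE = {
    1: (None, 0),  # consortium (top of the tree)
    2: (1, 1),     # system
    3: (1, 1),     # system
    4: (2, 2),     # branch
    5: (2, 2),     # branch
    6: (3, 2),     # branch
}


def ancestors(org_id):
    """Return org_id followed by each of its ancestors up to the root."""
    chain = []
    while org_id is not None:
        chain.append(org_id)
        org_id = ORG_TREE[org_id][0]
    return chain


def descendants(org_id):
    """Return org_id plus everything below it in the tree."""
    found = [org_id]
    for child, (parent, _depth) in ORG_TREE.items():
        if parent == org_id:
            found.extend(descendants(child))
    return found


def org_descendants_at_depth(org_id, depth):
    """Assumed semantics: anchor at the ancestor with the given depth."""
    anchors = [a for a in ancestors(org_id) if ORG_TREE[a][1] == depth]
    if not anchors:
        # No ancestor exists at that depth, so the resulting list
        # (and any SQL IN clause built from it) is empty.
        return []
    return sorted(descendants(anchors[0]))


if __name__ == "__main__":
    print(org_descendants_at_depth(1, 0))  # whole tree
    print(org_descendants_at_depth(1, 1))  # empty: the root has no depth-1 ancestor
    print(org_descendants_at_depth(4, 1))  # subtree of the branch's parent system
```

Under that assumption, a depth deeper than the org unit's own depth can never match, which is consistent with the "Empty IN list" error csharp describes.
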
15:19 * miker looks for where that's called
15:19 csharp sec...
15:20 csharp Open-ILS/src/perlmods/lib/OpenILS/WWW/EGCatLoader/Record.pm
15:21 csharp get_mfhd_summaries sub
15:21 csharp called inside "load_record"
15:21 miker yeah
15:22 miker is there a cgi param called copy_depth?
15:22 miker (and if not, is there one called depth?)
15:22 csharp haven't gotten that far :-)
15:22 miker and, for either or both, what are they?
15:22 miker do you have a URL that fails?
15:22 csharp lemme see if I can find one
15:23 csharp started from the server logs and moved backwards
15:25 miker looks like one of at least 3 things: a copy_depth cgi param of 1, a depth cgi param of 1, or the Consortium ou_type has a depth of 1 in the db ... that's the order of likelihood, I think
15:28 csharp miker: no luck so far
15:30 csharp we can rule out the third possibility off the bat
15:33 csharp I think this may be the search: 2017-10-17 12:40:16 brick01-head osrf_json_gw: [INFO:18380:Search.pm:193:150806323318380210] tpac: site=TRRLS, depth=1, user_query=asvab, query=asvab site(TRRLS) depth(1)
15:34 miker csharp: is TRRLS not a real shortname? that could indeed be it if we fall back to the top of the org tree
15:34 csharp TRRLS is real
15:35 csharp looking for more data on the search
15:35 miker ah, well, that's unlikely to have a context org of 1, ISTM
15:36 csharp could the depth param be persisting when the user changes scope to (in our case) "All PINES Libraries"?
15:36 csharp not sure that's what happened - just imagining possibilities
15:37 miker if they selected the "search all libraries" checkbox? maybe, yeah
15:39 csharp weird - I'll just keep my eyes open for a clear example
15:39 csharp lots of log activity this time of day, obvs
15:40 miker hrm... no, that's not happening
15:40 miker that just sets depth to 0, which is what I'd expect
15:44 miker yeah, not sure how it's getting there ... not even sure how copy_depth might get set, TBH
15:50 mmorgan joined #evergreen
15:54 csharp whatever's causing it, it happens with some frequency, but nothing crazy - 12 p.m. was the top hour with 18 instances of "Empty IN list" (which I think occurs twice in the logs per error occurrence)
15:58 Bmagic Has anyone had a request from a library to have the system automatically delete patrons after a certain amount of inactivity time?
16:01 csharp Bmagic: we mark ours inactive after 3 years of non-activity - we don't delete them though
16:01 csharp but we exclude inactive patrons from our counts
16:02 csharp Bmagic: my query: http://git.evergreen-ils.org/?p=contrib/pines.git;a=blob;f=sql/set_patrons_inactive.sql;h=a506947198f1b219482bc82bc96d72e461c4a95f;hb=HEAD
16:02 Bmagic thanks!
16:02 Bmagic I will suggest that instead of the delete
16:03 csharp Bmagic: we have a policy your libs might want to crib from
16:04 csharp https://pines.georgialibraries.org/dokuwiki/doku.php?id=circ:accounts:inactive_patrons
16:04 Bmagic yeah, that sounds good!
16:04 Bmagic csharp++
16:05 mmorgan csharp: Does PINES make use of expiration dates at all?
16:07 csharp mmorgan: we expire accounts every two years, but we don't mark them inactive unless they meet the "inactive" criteria
16:08 * csharp runs off to go pick up kid
16:08 mmorgan ok, interesting. Thanks!
16:13 * mmorgan went looking through logs for 'Empty IN list' errors and found the error associated with calls like this:
16:13 mmorgan open-ils.cstore open-ils.cstore.direct.config.usr_setting_type.search.atomic {"name":[]}
16:16 dbwells miker: Just caught up on reading bug #1704396.  Not sure if my issue is even related to the other search hangs, but it sure seems that way.  The "good" news is a simple srfsh script with 500 open-ils.search.biblio.copy_counts.location.summary.retrieve calls is enough to reliably trigger this behavior for us, unfortunately only on production.
16:16 pinesol_green Launchpad bug 1704396 in Evergreen "Slowness for metarecord and one-hit searches in 2.12" [High,Confirmed] https://launchpad.net/bugs/1704396
16:16 dbwells miker: so, I should at least be in a somewhat reasonable position to test patches.
16:19 dbwells miker: I also wonder (without much understanding yet) whether the bug might be in the ".atomic" wrapper.  Logs look like storage responds, but search just keeps waiting for it to finish.  I might try a quick non-atomic copy of open-ils.search.biblio.copy_counts.location.summary.retrieve just to see if it passes the srfsh test.
16:30 b_bonner joined #evergreen
16:35 rlefaive joined #evergreen
16:43 dbwells Just anecdotal at this point, but "atomic" in the storage call does not seem to be a factor.
17:05 miker dbwells: thanks! that's one less thing to check early on
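
A repeat-call script like the one dbwells mentions at 16:16 could be generated along these lines. This is a sketch only: dbwells did not paste the actual script, so the one-request-per-line layout using srfsh's "request" command is an assumption, and build_srfsh_script and the output filename are hypothetical. The service, method, and parameters are taken from the gateway log line quoted at 14:42.

```python
def build_srfsh_script(service, method, params, count=500):
    """Build the text of a script that repeats one srfsh request `count` times.

    Assumes one `request <service> <method> <params>` command per line.
    """
    line = f"request {service} {method} {params}"
    return "\n".join([line] * count) + "\n"


if __name__ == "__main__":
    script = build_srfsh_script(
        "open-ils.search",
        "open-ils.search.biblio.copy_counts.location.summary.retrieve",
        '"1707932", 1, 0',
    )
    # Hypothetical filename; feed the file to srfsh however your setup expects.
    with open("copy_counts_repeat.srfsh", "w") as handle:
        handle.write(script)
    print(f"wrote {script.count(chr(10))} request lines")
```

Keeping the generator separate from the run makes it easy to vary the record ID or the call count when trying to reproduce the hang.
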
17:09 mmorgan left #evergreen
17:11 Jillianne joined #evergreen
17:12 jihpringle joined #evergreen
17:50 b_bonner left #evergreen
17:54 Dyrcona joined #evergreen
18:02 pinesol_green News from qatests: Test Success <http://testing.evergreen-ils.org/~live>
18:23 phooks joined #evergreen
19:26 phooka joined #evergreen
19:30 phooka working on building out new production servers  and  autogen.sh is erroring out while Updating OrgTree
19:31 phooka ERROR:  column i18n_l.rtl does not exist