Time |
Nick |
Message |
07:07 |
|
rjackson_isl joined #evergreen |
07:11 |
csharp |
Happy SysAdmin Day to all of my sisters and brothers |
07:12 |
csharp |
and others |
07:27 |
|
Dyrcona joined #evergreen |
08:37 |
|
collum joined #evergreen |
08:37 |
|
mmorgan joined #evergreen |
08:43 |
Dyrcona |
mmorgan: Setting old PO JEDI events to complete is a bad idea. It will break the pusher in at least two ways: 1) It will blow at line 142 if the purchase order in the target field no longer exists and 2) it will blow at line 161 if there are no acq.edi_message entries and the template output doesn't exist. |
08:43 |
* Dyrcona |
follows up on a conversation from a few days ago in a different channel. |
08:44 |
Dyrcona |
In fact, messing with the state of old events seems like a bad idea, now that I've done it.
08:47 |
mmorgan |
Dyrcona: Hmm. Good to know. I'm pretty sure I've only messed with the state of notice triggers, have not needed to do anything with PO JEDI events. |
08:49 |
|
yboston joined #evergreen |
08:50 |
Dyrcona |
I plan to start purging old events when I'm back from vacation in August. |
08:50 |
Dyrcona |
I set up the retention intervals and all that last night. |
09:06 |
|
bwillis joined #evergreen |
09:08 |
csharp |
purging events was taking too long for us so we stopped doing it (though maybe berick had a fix for that? not remembering clearly) |
09:09 |
csharp |
but the purge query was running for days |
09:15 |
|
jvwoolf joined #evergreen |
09:18 |
Dyrcona |
Whee, fun with EDI continues: Can't call method "message" on an undefined value at /usr/local/share/perl/5.22.1/OpenILS/Utils/RemoteAccount.pm line 586. |
09:20 |
Dyrcona |
OK, that's a line that I modified to try getting the actual server messages, looks like a certain vendor wouldn't let us connect and the pusher just plows ahead trying to login and PUT files despite there being no FTP connection. |
09:21 |
Dyrcona |
Dumb as a box of chocolates... |
09:21 |
|
nfBurton joined #evergreen |
09:21 |
csharp |
yeah - that mechanism isn't really robust :-/ |
09:22 |
|
yboston joined #evergreen |
09:24 |
Dyrcona |
I know. I have opened Lp bugs on it and plan to fix it if I ever have time after cleaning up the mess and doing other things that are apparently a higher priority, though I can't get to work on them, because I spend half my day resetting PO JEDI events and ORDERS messages. |
09:28 |
Dyrcona |
And, yes, another log line says the vendor is not allowing our "user" with 70-some odd accounts to login because of the brain deadness of the pusher. |
09:28 |
Dyrcona |
Well, it doesn't say why, but we get "User X cannot login." before we get to sending them the password. |
09:39 |
csharp |
Dyrcona: another thing to consider is moving to EDI attributes - we just recently got to a point where all of our acq libraries are using that mechanism |
09:40 |
Dyrcona |
csharp: We're getting there, but that ain't my department. |
09:41 |
csharp |
yeah - I started pushing for that really hard after doing the PO JEDI resets to the point of insanity |
09:41 |
Dyrcona |
Our real problem is the fetcher. I think it makes so many connections that it causes this vendor to block us for a while. |
09:42 |
|
khuckins joined #evergreen |
09:49 |
* Dyrcona |
wrote a Perl script to reset the events and messages for purchase orders. I put it on my pastebin. |
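(Dyrcona's actual script is Perl and lives on his pastebin; as a rough SQL sketch of the same idea — column names per the stock action_trigger.event table, the event definition name and purchase order id are placeholders:)

```sql
-- Hypothetical reset of a purchase order's JEDI event so the EDI
-- pusher will pick it up and retry it.  12345 is a placeholder PO id.
UPDATE action_trigger.event
   SET state = 'pending',
       start_time = NULL,
       update_time = NOW(),
       complete_time = NULL
 WHERE target = 12345
   AND event_def IN (SELECT id FROM action_trigger.event_definition
                      WHERE name = 'PO JEDI');
```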
09:51 |
|
sandbergja joined #evergreen |
09:54 |
berick |
csharp: did you do an initial purge to get things sort of under control? |
09:56 |
csharp |
berick: yeah |
09:56 |
berick |
csharp: also, for reasons I don't recall -- likely speed fixes in the heat of battle -- the SQL I ended up using locally is slightly different. note the join and temp table. https://gist.github.com/berick/16a20b3da23c73fabe97575d50d601ae |
09:57 |
csharp |
I was running the purge nightly on a cron, but it started piling up because each query was taking > 72 hours |
09:57 |
berick |
ours only runs a few minutes each night |
09:57 |
berick |
omg |
09:57 |
csharp |
huh |
09:57 |
berick |
do you have the output indexes? |
09:57 |
berick |
on action_trigger.event |
09:58 |
berick |
e.g. "atev_async_output" btree (async_output) |
09:58 |
* berick |
looks for index discrepancies
09:58 |
csharp |
oh - no I don't have that index |
09:59 |
* mmorgan |
doesn't have that index either |
09:59 |
Dyrcona |
I don't have any indexes on any of the output columns, apparently. |
09:59 |
berick |
huh, those should be in stock |
10:00 |
berick |
yeah, they're in the stock sql |
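(For reference, a sketch of the three stock output indexes being discussed — atev_async_output is named verbatim above; the other two names are assumed to follow the same pattern. CONCURRENTLY avoids locking a busy events table while the index builds:)

```sql
-- Hypothetical restatement of the stock action_trigger.event output
-- indexes that were missing on the pre-3.2.6 databases in question.
CREATE INDEX CONCURRENTLY atev_template_output
    ON action_trigger.event (template_output);
CREATE INDEX CONCURRENTLY atev_error_output
    ON action_trigger.event (error_output);
CREATE INDEX CONCURRENTLY atev_async_output
    ON action_trigger.event (async_output);
```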
10:00 |
Dyrcona |
I'm looking with PgAdmin at the moment and it's not showing them. |
10:00 |
berick |
could have been missed in an upgrade |
10:00 |
mmorgan |
Looks like they're new in 3.2.6? |
10:01 |
csharp |
yeah, I see them in the stock def |
10:01 |
berick |
mmorgan: yep |
10:01 |
csharp |
haven't looked at outputs |
10:01 |
csharp |
er.. upgrades, I mean |
10:01 |
mmorgan |
They're in version upgrade 3.2.5-3.2.6, we're on 3.2.4 |
10:01 |
csharp |
oh - ok |
10:01 |
csharp |
yeah we're on 3.2.3 |
10:02 |
csharp |
with selected backports |
10:02 |
Dyrcona |
We're on 3.2.4 with selected backports |
10:02 |
berick |
those indexes will help /a lot/ |
10:02 |
mmorgan |
Ditto on the selected backports :) |
10:02 |
berick |
apparently I also added an index to speed up the purging... |
10:02 |
berick |
"atev_def_state_update_time_idx" btree (event_def, state, update_time) |
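(Written out as a statement — the index definition is quoted verbatim above; CONCURRENTLY is an addition for applying it to a live system:)

```sql
CREATE INDEX CONCURRENTLY atev_def_state_update_time_idx
    ON action_trigger.event (event_def, state, update_time);
```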
10:03 |
Dyrcona |
I also have 3 odd tables in action_trigger: new_environment, new_event_def, and new_params. |
10:03 |
Dyrcona |
berick: That index should probably get bugged on Lp. |
10:04 |
berick |
Dyrcona: agreed |
10:04 |
* berick |
will add the lp shortly |
10:04 |
* csharp |
adds the index |
10:04 |
csharp |
(es) |
10:05 |
csharp |
berick++ |
10:05 |
Dyrcona |
Looks like those odd tables come from something someone did locally a while ago and I can probably drop them. |
10:07 |
Dyrcona |
They are also quite empty. I will drop them on Tuesday night.
10:14 |
berick |
huh, the (event_def, state, update_time) index was in our DB before my time. |
10:15 |
berick |
though it does look like it would help. maybe i'll wait and see if those indexes solve everyone's slowness first. |
10:32 |
Dyrcona |
I have (event_def, state), but not with update_time. |
10:43 |
|
khuckins_ joined #evergreen |
10:53 |
|
stephengwills joined #evergreen |
11:02 |
pinesol |
News from qatests: Testing Success <http://testing.evergreen-ils.org/~live> |
11:15 |
|
jvwoolf joined #evergreen |
11:22 |
rjackson_isl |
just discovered that Cricket (phone plan) service provider changed their sms address of preference. Will changes to config.sms_carrier be "seen" without any type of autogen or restarting of services? |
11:34 |
Dyrcona |
I think so. I don't recall anywhere that gets cached by autogen.sh. |
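(A sketch of the kind of update being described — the replacement address, the `$number` template convention, and the carrier match are all assumptions; check the vendor's announcement for the real gateway address:)

```sql
-- Hypothetical fix for a carrier that changed its SMS email gateway.
UPDATE config.sms_carrier
   SET email = '$number@new-gateway.example.com'
 WHERE name ~* 'cricket';
```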
11:35 |
Dyrcona |
Also, adding those a/t event output indexes made my PO JEDI reset script noticeably faster on my test VM.
11:35 |
csharp |
cricket-- |
11:36 |
Dyrcona |
Well, cheap wireless in general..... |
11:36 |
csharp |
they've done that several times over the last few years - at one point we just removed them from the public listing |
11:36 |
csharp |
rjackson_isl: no autogen.sh required, but it may be cached by the browser |
11:37 |
jeff |
cricket++ (even though they're just AT&T by another name) |
11:37 |
rjackson_isl |
csharp++ Dyrcona++ - now waiting to see if the change worked :) |
11:38 |
csharp |
jeff: my only experience of them is from complaining patrons regarding Evergreen, thus my decrementing :-) |
11:38 |
Dyrcona |
AT&T isn't AT&T any more... :) |
11:38 |
jeff |
Yeah. I can't fault them for us doing it wrong. :-) |
11:40 |
Dyrcona |
And, I think my a/t runner is faster, too! |
11:40 |
Dyrcona |
Could be that there were only a handful of events to process, though. |
11:57 |
|
sandbergja joined #evergreen |
12:10 |
Dyrcona |
Looking at the action_trigger.purge_events function, it's probably the linked_outputs query that's taking so long. It might be faster to do the whole delete in a loop on the output of a select statement.
12:11 |
Dyrcona |
Better than deleting "where not in ..." |
12:14 |
khuckins_ |
It looks like the latest version of Chrome might be breaking OPAC auto-suggest functionality, is anyone else able to confirm? |
12:25 |
jeff |
is there a literal WHERE NOT IN? |
12:25 |
* jeff |
looks |
12:30 |
jeff |
yeah. i'd try moving the union queries from being in a CTE to being in a subselect and changing the NOT IN to NOT EXISTS. |
12:30 |
* jeff |
thinks |
12:32 |
jeff |
or just three not exists, even. |
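(A sketch of the three-NOT-EXISTS form jeff describes, replacing the DISTINCT + CTE + NOT IN approach in the stock purge — treat the exact stock query shape as an assumption:)

```sql
-- Hypothetical rewrite of the orphaned-output cleanup: an output row
-- is deletable only when no event references it in any of the three
-- output columns.  NOT EXISTS lets the planner use the per-column
-- indexes instead of materializing one big NOT IN list.
DELETE FROM action_trigger.event_output o
 WHERE NOT EXISTS (SELECT 1 FROM action_trigger.event e
                    WHERE e.template_output = o.id)
   AND NOT EXISTS (SELECT 1 FROM action_trigger.event e
                    WHERE e.error_output = o.id)
   AND NOT EXISTS (SELECT 1 FROM action_trigger.event e
                    WHERE e.async_output = o.id);
```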
12:36 |
|
collum joined #evergreen |
12:37 |
jeff |
well, it makes an improvement selecting from a small sample where there are 5515 orphaned outputs. from 5832.446 to 3600.746 ms. |
12:40 |
phasefx_ |
bshum++ |
12:43 |
jeff |
(adding the missing indexes would also be required to speed up the delete on event_output) |
12:43 |
jeff |
but the DISTINCT + CTE + NOT IN still has potential to slow things down, especially in a larger-than-our db. |
12:44 |
Dyrcona |
I was away for a bit, but yeah. |
12:44 |
nfBurton |
Hey. So I have been trying to add my live data to my development server and can't seem to get opensrf to cooperate. If I create the default DB as part of the Evergreen installation, srfsh checks work fine. Restoring from my pg_dump and trying the same thing with srfsh gives me "received Data: 'x'". The request completes successfully but isn't actually working; even if I reset the password in the database, the srfsh login response is "Login_failed". Is there something I may be missing that I need to modify?
12:44 |
Dyrcona |
I think the lack of index on update_time is also a big factor. |
12:45 |
Dyrcona |
nfBurton: How did you reset the password in the database? Also, I'm not sure, exactly, what your problem is. |
12:45 |
Dyrcona |
jeff: I'll play with it some more next week, but after adding the output indexes, the db function finished in 31 minutes for me. |
12:46 |
Dyrcona |
I've been deleting individual events and outputs in functions lately, so I've been getting the output id into a variable, then deleting the event, then deleting the output. |
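(The per-event approach Dyrcona describes — capture the output ids, delete the event, then delete any outputs no longer referenced — can be sketched as a PL/pgSQL loop; the state and interval filters here are placeholders:)

```sql
-- Hypothetical per-row purge: delete each old event, then its outputs
-- if no surviving event still points at them.
DO $$
DECLARE
    ev action_trigger.event%ROWTYPE;
BEGIN
    FOR ev IN
        SELECT * FROM action_trigger.event
         WHERE state = 'complete'
           AND update_time < NOW() - INTERVAL '1 year'
    LOOP
        DELETE FROM action_trigger.event WHERE id = ev.id;
        DELETE FROM action_trigger.event_output o
         WHERE o.id IN (ev.template_output, ev.error_output, ev.async_output)
           AND NOT EXISTS (
                 SELECT 1 FROM action_trigger.event e
                  WHERE o.id IN (e.template_output, e.error_output,
                                 e.async_output));
    END LOOP;
END $$;
```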
12:47 |
Dyrcona |
It also deleted 13.6 million ate rows. |
12:52 |
nfBurton |
I just did an update on the row and figured the trigger would take care of it. Or do I need to use a specific function? |
12:52 |
nfBurton |
I've narrowed it down to something to do with the DB |
12:52 |
bshum |
phasefx: It took longer than I thought to work out all the kinks, but I'm glad it's mostly working fine again :) |
12:53 |
Dyrcona |
nfBurton: You need to use a couple of functions, but I've got it into one. |
12:53 |
nfBurton |
Because the default loaded database works fine. It's just when I Drop it, run the Create Database line and reload from the pg_dump that it doesn't work. I can see my OUs but the searches/logins don't seem to work |
12:54 |
Dyrcona |
nfBurton: https://pastebin.com/YTLh7pC9 |
12:54 |
Dyrcona |
nfBurton: Is the dump from the same version of Evergreen? |
12:55 |
nfBurton |
I'm dumping from 3.2.4 to HEAD. I've tried doing the upgrade scripts to match version too |
12:56 |
Dyrcona |
nfBurton: You're doing it wrong. |
12:56 |
nfBurton |
srfsh logs have no errors either |
12:56 |
nfBurton |
Oh? That's why I'm here lol |
12:56 |
Dyrcona |
If you have a complete dump of 3.2.4, then just pg_restore that as a new database, then run the upgrade scripts.
12:57 |
Dyrcona |
Don't create the database beforehand. Chances are the restore will either drop it or things'll be fubar.
12:57 |
nfBurton |
oh okay. I've been using psql -d -f |
12:57 |
nfBurton |
with the database and filename of course |
12:58 |
Dyrcona |
That's not likely to work. |
12:58 |
nfBurton |
Okay. Good to know |
12:58 |
jeff |
Dyrcona: "Don't create the database before hand" is contrary to how I usually do it. What experience leads you to make that recommendation? |
13:00 |
Dyrcona |
jeff: I do the following on a weekly basis: /usr/lib/postgresql/9.5/bin/pg_restore -U evergreen -h localhost -C -c -d postgres -j 8 ${dumpfile} |
13:00 |
nfBurton |
su root |
13:00 |
nfBurton |
oops |
13:01 |
Dyrcona |
Note the -C and -c options, to create and drop the database. |
13:02 |
Dyrcona |
I have had problems in the past trying to load dumps into existing databases, particularly when the versions mismatched. It has been several years, so I don't remember all of the details. |
13:03 |
nfBurton |
Thanks. I'm going to try that |
13:03 |
nfBurton |
Also, thanks for the password update function |
13:03 |
nfBurton |
Dyrcona++ |
13:05 |
jeff |
nfBurton: Be aware that that command will overwrite any existing database that has the same name as the database name contained within the dump file. If you're on a cluster with no other databases, that probably isn't a concern, but it's worth a warning. |
13:06 |
Dyrcona |
Also note the specific path to the versioned pg_restore. I'm on a server with multiple clusters of different Pg versions. :)
13:06 |
nfBurton |
Yeah, this is a stand alone dev server. NBD if I wreck it lol |
13:06 |
Dyrcona |
But, if your database is named evergreen, it's probably what you want to do. |
13:07 |
bshum |
"I'm gonna wreck it!" |
13:07 |
Dyrcona |
I also figured you were preparing for an upgrade. |
13:07 |
nfBurton |
haha preparing for contribution. I just need oodles more data |
13:07 |
Dyrcona |
I need to make time to practice upgrading to 9.6, then to 10, and then to 11. I guess I'll install 12 when it comes out, too. |
13:08 |
Dyrcona |
yeahp. I'm using an old mail server with 8TB of disk space for a development database server. |
13:08 |
Dyrcona |
Multiple copies of production data hanging around.
13:09 |
Dyrcona |
BTW, if you want two or more copies of a database, I find it's faster to pg_restore the main one and then make the clones using CREATE DATABASE with the first one as a template.
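(The template trick in one statement — database names are placeholders, and PostgreSQL requires that nothing be connected to the template while it copies:)

```sql
-- Clone a restored database without running a second pg_restore.
CREATE DATABASE evergreen_clone TEMPLATE evergreen;
```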
13:09 |
jeff |
this was my minor speed-up of the orphaned action_trigger.event_output query: https://gist.github.com/jeff/50e287ddb80b416c15cac6775617884c |
13:10 |
jeff |
by itself that does not speed up the actual delete. |
13:10 |
Dyrcona |
I think it would be faster to pull the whole ate row in a loop, delete from the output table with a coalesce on the output fields, then delete from ate. |
13:10 |
jeff |
but especially in a larger db or with a >5k orphaned outputs situation, it could help further. |
13:11 |
Dyrcona |
But orphaned outputs is a slightly different problem. :) |
13:11 |
Dyrcona |
Though the purge events function looks to solve that problem, too. :) |
13:12 |
jeff |
well, the current purge is a two step delete where step 1 creates a bunch of orphaned output and then step 2 deletes that. :-) |
13:12 |
Dyrcona |
What's fun is I've found ate rows pointing to nonexistent output rows. |
13:12 |
jeff |
(but yes, it has a side effect of also removing outputs that were previously orphaned before step 1 ran) |
13:12 |
Dyrcona |
jeff: Yeahp. |
13:14 |
* Dyrcona |
tries jeff's second query on the reports db to see how many orphaned rows it turns up. |
13:15 |
Dyrcona |
I wonder if a single left join would be faster? |
13:15 |
jeff |
the indexes on action_trigger.event.{template_output,error_output,async_output} should be what speed up the delete most. |
13:16 |
jeff |
Dyrcona: left join where the output id is equal to template_output OR error_output OR async_output then have the delete WHERE restrict to where the joined table row is null? dunno, i'll try! |
13:18 |
Dyrcona |
I'll have to let this go for now. I've got something else that I should make sure I finished this morning. Been jumping from thing to thing. |
13:19 |
jeff |
I am not at all familiar with that feeling. |
13:19 |
* jeff |
ducks |
13:19 |
jeff |
the left join (at least, as I wrote it) is pretty terrible, it turns out. |
13:21 |
Dyrcona |
Yeah, I wondered. The ORs probably slow it down considerably.
13:22 |
jeff |
surprisingly terrible! |
13:28 |
jeff |
cancelled after 584388.216 |
13:53 |
|
sandbergja joined #evergreen |
13:59 |
|
yboston joined #evergreen |
14:10 |
|
khuckins joined #evergreen |
14:58 |
Bmagic |
Does SIP for Evergreen allow for "item limits" to be conveyed in the 64 message? |
14:59 |
nfBurton |
Don't believe so. It just blocks when you hit max. But there is no limit communicated |
15:00 |
jeff |
Bmagic: what field are you asking about? |
15:00 |
|
mmorgan left #evergreen |
15:00 |
Bmagic |
bibliotheca seems to think that we should respond in the 64 message the item limit |
15:01 |
Bmagic |
"between the permanent location and valid patron are the listed limits for patron account." |
15:01 |
nfBurton |
I don't believe that is a standard SIP field |
15:01 |
jeff |
Bmagic: for what purpose, and do you know what field you're talking about? |
15:01 |
Bmagic |
"BZ" (if I'm reading this email right) |
15:02 |
Bmagic |
FID_HOLD_ITEMS_LMT is what our code says |
15:02 |
jeff |
okay, hold items limit. |
15:03 |
nfBurton |
Actually, it does exist in the 3M documentation |
15:03 |
nfBurton |
http://multimedia.3m.com/mws/media/355361O/sip2-protocol.pdf |
15:03 |
nfBurton |
But no, our 64 response doesn't include it |
15:03 |
Bmagic |
"not supported" by Evergreen? |
15:05 |
jeff |
SIPServer doesn't populate that field. |
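(For anyone testing from the client side, pulling a variable-length field like BZ, the hold items limit per the 3M SIP2 document linked above, out of a 64 Patron Information response can be sketched as below; the sample message layout is invented and simplified, and a real parser would consume the fixed-length header first:)

```python
from typing import Optional

def sip2_field(message: str, code: str) -> Optional[str]:
    """Return the value of the first pipe-delimited variable field whose
    two-character code matches, or None if the field is absent.
    Scanning every chunk is good enough for a sketch; production code
    should skip the fixed-length header before splitting."""
    for chunk in message.split("|"):
        if chunk.startswith(code):
            return chunk[len(code):]
    return None

# Hypothetical, abbreviated 64 response for demonstration only.
msg = "64              00120190726    103015AOBR1|AA12345|BZ0003|AY1AZE1C8"
print(sip2_field(msg, "BZ"))  # → 0003
```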
15:05 |
jeff |
How does bibliotheca plan to use it? |
15:05 |
nfBurton |
Also, this tool is amazing for testing SIP if you don't have it https://clcohio.org/sip-testing-tool/ |
15:05 |
Bmagic |
cool, (I sorta knew that because they are complaining about it) |
15:06 |
Bmagic |
awesome tool! I will get that for sure |
15:07 |
Bmagic |
I think they would use this information to artificially block patrons at their limit |
15:11 |
Bmagic |
Evergreen has this issue when using self-check machines, where the patron is allowed to go over their limit by 1. This is because the self-check machine will check the item out and then the patron will be blocked (once the penalties are calculated)
15:11 |
Bmagic |
it needs to "digest" the patron account once |
15:14 |
Bmagic |
Luckily we don't have many libraries using self-check as primary. We've been getting around it by running a sql update every minute to basically "sync" this specific patron penalty for this specific branch |
15:14 |
jeff |
oh? |
15:14 |
Dyrcona |
It would be difficult to guesstimate a holds limit from Evergreen since it varies depending on all kinds of factors.
15:14 |
Bmagic |
Dyrcona: I agree, I think the problem is complicated |
15:14 |
* Dyrcona |
can't type today. |
15:14 |
Dyrcona |
Why is anyone wanting to use it? |
15:15 |
Bmagic |
Dyrcona: I think they would use this information to artificially block patrons at their limit |
15:15 |
jeff |
Bmagic: are you talking about holds or circs? |
15:15 |
Dyrcona |
That's fine, except we don't know what that limit is because it varies. |
15:15 |
Bmagic |
circs (I know the vendor is asking about the BZ field which is confusing things I think)
15:16 |
* jeff |
goes to test something |
15:17 |
Dyrcona |
Send them -1 and see what their software does. :) |
15:19 |
Bmagic |
Dyrcona++ |
15:23 |
jeff |
My testing doesn't reproduce the issue. |
15:25 |
jeff |
I had an account with an item checked out. I changed the account to a profile which is limited to a very small number of items (3). I then checked out another item, bringing it to two. In a new session, I then tried to check out two more items, and I was blocked from checking out the fourth item. |
15:27 |
Bmagic |
hmmm |
15:27 |
Bmagic |
jeff++ |
15:28 |
Bmagic |
maybe we are fixing an issue that isn't there (anymore) |
15:28 |
jeff |
what events are you configured to override in the SIPServer config? |
15:28 |
Bmagic |
COPY_ALERT_MESSAGE COPY_BAD_STATUS COPY_STATUS_MISSING |
15:29 |
jeff |
interesting. and are you able to reproduce the issue, or are you going on hearsay? |
15:30 |
Bmagic |
I reproduced it like 6 years ago |
15:30 |
Bmagic |
never thought about it again until now
15:30 |
Bmagic |
might be time to disable that cronjob :) |
15:35 |
|
sandbergja joined #evergreen |
16:08 |
|
jvwoolf left #evergreen |
16:37 |
|
khuckins joined #evergreen |
23:02 |
pinesol |
News from qatests: Testing Success <http://testing.evergreen-ils.org/~live> |