02:08 *** JBoyer joined #evergreen
03:25 *** Bmagic_ joined #evergreen
03:25 *** dluch_ joined #evergreen
03:28 *** pastebot0 joined #evergreen
03:31 *** jeff_ joined #evergreen
03:34 *** dbs joined #evergreen
06:00 <pinesol> News from qatests: Testing Success <http://testing.evergreen-ils.org/~live>
07:26 *** rjackson_isl_hom joined #evergreen
08:35 *** mmorgan joined #evergreen
08:45 *** mantis joined #evergreen
09:02 *** Dyrcona joined #evergreen
09:58 *** collum joined #evergreen
10:05 *** jvwoolf joined #evergreen
10:29 <Dyrcona> There's something about declaring "package IO::Handle;" and then adding a new IO::Handle method in a Perl script that feels so wrong and so right at the same time.
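The monkey-patch Dyrcona describes can be sketched like this. It is a minimal illustration, not his actual script: the method body, log format, and filename are assumptions, though adding a method to IO::Handle by reopening the package is exactly the technique named above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;

# Reopen the IO::Handle package and add a method to it -- classic Perl
# monkey-patching. Every filehandle-like object now responds to log_msg().
package IO::Handle;

sub log_msg {
    my ($self, @msg) = @_;
    $self->print(scalar(localtime()), ': ', @msg, "\n");
}

package main;

# Which lets you write $log->log_msg(...) instead of log_msg($log, ...).
open(my $log, '>', 'script.log') or die "cannot open log: $!";
$log->log_msg('export started');
$log->close();
```

The trade-off, of course, is that every IO::Handle in the process picks up the new method, which is why it feels "so wrong and so right at the same time."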
10:30 <gmcharlt> heh
10:30 <gmcharlt> package Monkey::Patch...
10:37 <Dyrcona> All because I wanted to write $log->log_msg(...) and not log_msg($log, ...)
10:52 *** Bmagic joined #evergreen
11:00 <csharp_> never seen this one before, but I feel seen today: https://xkcd.com/1205/
11:01 <csharp_> (learning git submodules so integrating a third party thing locally will be "easier to manage")
11:02 <csharp_> also learning about git archive, which I didn't previously know about
11:03 <rhamby> csharp_ the comic doesn't capture the angst of sitting there going "this is a newish thing ... am I going to spend 10 minutes writing this script and never need to do it again, wasting 8 minutes, or am I being proactive?"
11:03 <csharp_> exactly - something along the lines of Pascal's Wager maybe?
11:03 <rhamby> right
11:11 <Dyrcona> Well, I like automating things because automated mistakes usually lend themselves to automated fixes.
11:12 <Dyrcona> csharp_: I work with a project that includes a submodule. If you have questions, let me know. I might be able to help.
11:13 <Dyrcona> So, I'm looking at MARC export, and I think it's a bug that it exports holdings in 852. I think it's supposed to be 952.
11:31 <Dyrcona> Eh, maybe that isn't a bug. I should have looked at the full description again. ;)
11:33 <Dyrcona> Hmm.. The query that I'm working on is going to be more complicated than I first thought...
11:34 <Dyrcona> Or, maybe not. I probably don't have to include locations for deleted copies.
11:35 <Dyrcona> Thanks, rubber ducky!
11:44 <mmorgan> __(')<
12:02 <Dyrcona> Wonder if I messed up, or if there are really that many copies: Record length of 2216371 is larger than the MARC spec allows (99999 bytes). at /usr/share/perl5/MARC/File/USMARC.pm line 314.
12:03 <Dyrcona> mmorgan++
12:03 <Dyrcona> Guess I'll convert to XML and have a look after it finishes.
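Converting a binary MARC file to MARCXML for inspection can be sketched like this, since MARCXML has no equivalent of the ISO 2709 99,999-byte record length cap that triggered the error above. This assumes the CPAN MARC::Batch and MARC::File::XML modules; the function name and strictness settings are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MARC::Batch;
use MARC::File::XML (BinaryEncoding => 'utf8');

# Read binary MARC (USMARC) and write a MARCXML collection, continuing
# past per-record errors such as oversized records.
sub marc_to_xml {
    my ($infile, $outfile) = @_;
    my $batch = MARC::Batch->new('USMARC', $infile);
    $batch->strict_off();    # warn rather than die on bad records

    open(my $out, '>:encoding(UTF-8)', $outfile) or die "open $outfile: $!";
    print $out MARC::File::XML::header();
    while (my $record = $batch->next()) {
        print $out MARC::File::XML::record($record);
    }
    print $out MARC::File::XML::footer();
    close($out);
}

marc_to_xml(@ARGV) if @ARGV == 2;
```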
12:05 <Dyrcona> Heh. I messed up my query....
12:06 <Dyrcona> I'm dumping info for all of the libraries copies.
12:06 *** jihpringle joined #evergreen
12:11 <Dyrcona> Well, that should be "all of the library's copies." I forgot to join on asset.call_number where record = ?
12:13 <Dyrcona> Also, I wonder if we should add an OU setting for ISIL codes?
12:43 <Dyrcona> Well, the extract is a lot slower since I added the query to grab the copy locations per bib.
12:47 <csharp_> Dyrcona: this solved my issue: https://pypi.org/project/git-archive-all/ - basically trying to change up how we deal with our customizations
12:47 <csharp_> up to now, we pull down the stock tar.gz, uncompress it, and lay our changes on top and I'd rather do all the changes within git
12:49 <Dyrcona> OK. I just make a branch for my customizations and keep it up to date with master, but I guess adding the submodule complicates things?
12:50 <csharp_> we'll see if it becomes a pain in the ass :-)
12:50 <Dyrcona> Always fun when you get the submodule out of sync with the main code. :)
12:52 <Dyrcona> Here's the project I'm talking about: https://github.com/Dyrcona/openfortigui
12:54 <Dyrcona> Ugh. Looks like this code might be too slow to be useful.
12:57 <Dyrcona> When it was just straight up dumping MARC, it took about 10 to 15 minutes to dump 48,000+ records. It has been running for about 45 minutes now and only dumped 1,296 records with holdings. I should probably modify the main query to return the marc and copy info.
13:05 <Dyrcona> Wonder if I can array_agg over an array_agg?
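For reference, PostgreSQL won't nest two aggregates directly (array_agg(array_agg(x)) is an error), but the same effect comes from aggregating in a subquery and aggregating again outside it. A sketch against Evergreen's asset schema (the table and column names are real Evergreen ones, but the exact query is illustrative, not the one from the log):

```sql
-- Inner array_agg collects barcodes per call number; the outer
-- aggregate collects those per bib record. Note that array_agg over an
-- array column builds a multidimensional array and requires every inner
-- array to have the same length, so json_agg is often the safer outer
-- aggregate when the inner lists vary in size.
SELECT record,
       json_agg(barcodes) AS barcodes_by_call_number
  FROM (SELECT acn.record,
               array_agg(ac.barcode) AS barcodes
          FROM asset.call_number acn
          JOIN asset.copy ac ON ac.call_number = acn.id
         WHERE NOT ac.deleted AND NOT acn.deleted
         GROUP BY acn.record, acn.id) per_call_number
 GROUP BY record;
```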
13:22 <Bmagic> Dyrcona: I've got some perl that dumps records in parallel, I've seen it hit 300 records/second
13:24 <Bmagic> IIRC, 8 threads. Mind you, it's not using the perl "threads" module because Encode.pm. Instead it launches a system command and monitors a mutual file on the fs
13:47 <Dyrcona> Bmagic: I'm not sure doing this in parallel is worth it/possible since I have to write to a single output file.
13:48 <Dyrcona> I don't bother with threads in Perl 5. I use fork.
13:49 <Dyrcona> Anyway, I think I've got a solution. Rather than run this time consuming query once for each record, I'll do it once for all records and make a hash table of the information per record id.
13:49 <Dyrcona> If I get the options right, I can probably have selectall_arrayref make the data structure for me.
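The selectall_arrayref option Dyrcona is alluding to is DBI's Slice, which returns one hashref per row; keying those into a hash gives the per-record lookup table in a single query. A runnable sketch using an in-memory SQLite table as a stand-in for the real Evergreen copy/location query (the table name and data here are invented for the demonstration):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Stand-in for the Evergreen query: one row per copy, tagged with its
# bib record id.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1 });
$dbh->do('CREATE TABLE copy_info (record INTEGER, location TEXT, org_unit TEXT)');
my $ins = $dbh->prepare('INSERT INTO copy_info VALUES (?, ?, ?)');
$ins->execute(@$_) for ([1, 'Stacks', 'Main'], [1, 'Reference', 'Main'],
                        [2, 'Stacks', 'Branch']);

# Slice => {} makes selectall_arrayref return an arrayref of hashrefs.
my $rows = $dbh->selectall_arrayref(
    'SELECT record, location, org_unit FROM copy_info',
    { Slice => {} });

# Hash table of copy info per record id: one pass over the result set,
# no per-record queries afterward.
my %copies_by_record;
push @{ $copies_by_record{ $_->{record} } }, $_ for @$rows;
```

Looking up `$copies_by_record{$bib_id}` during the export loop then costs a hash access instead of a round trip to the database.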
13:53 <Dyrcona> My program basically works like this: Get a list of bre.id using one of 3 queries. After that, loop through the array of ids and grab the marc for each one. If this is a batch of deletes, set the leader 05 to d and write to the binary output file. Otherwise, delete the 852 tags in the marc, look up the copy location and org unit name for each copy, and add an 852 to the marc for each.
13:53 <Dyrcona> Then, write it to the output file.
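The per-record step just described can be sketched with MARC::Record. This is an illustration of the flow, not Dyrcona's script: the 852 subfield assignments ($b for the org unit, $c for the shelving location) and the hashref keys are assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MARC::Record;
use MARC::Field;

# Prepare one bib for export: flag deletes in the leader, otherwise
# rebuild the 852 holdings fields from copy data, then serialize.
sub prepare_record {
    my ($record, $is_delete, @copies) = @_;
    if ($is_delete) {
        # Leader position 05 is the record status; 'd' marks a delete.
        my $leader = $record->leader();
        substr($leader, 5, 1) = 'd';
        $record->leader($leader);
    }
    else {
        # Replace any existing 852 holdings with one field per copy.
        my @old = $record->field('852');
        $record->delete_fields(@old) if @old;
        for my $copy (@copies) {
            $record->append_fields(MARC::Field->new(
                '852', ' ', ' ',
                b => $copy->{org_unit},
                c => $copy->{location},
            ));
        }
    }
    return $record->as_usmarc();
}
```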
13:54 <Dyrcona> It got really slow when I added the copy location/org_unit query.
14:02 <Bmagic> I had some of the same challenges
14:04 <Dyrcona> Yeah. Not the first time, not the last.
14:04 <Bmagic> I think I landed on: run a query to get the bottom ID number. Start there and alter the query with LIMIT 100 OFFSET X. Feed that query into a thread, move to the next thread and shift OFFSET. Measure the time; if it's faster than last time, then increase the LIMIT value. If no results, then recalculate the bottom ID number using OFFSET to make sure I'm not starting at the same bottom
14:05 <Bmagic> worked out all the kinks and it runs pretty smooth now. Kinda neat. The master thread writes a progress file that you can cat during runtime to see how many threads are going, how many records per second each thread is producing, and how many total records per second for the whole operation
14:08 <Bmagic> each thread writes a mrc file into a temp folder and tells the master thread the location of the output. When all is done, the master thread reads all outputs and streams them into a single file, removing the temp files in the process. Then FTPs it to wherever or writes it to the configured hard drive folder, etc.
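The windowed query each of Bmagic's workers runs would look roughly like this. The LIMIT/OFFSET values and placeholder names are illustrative; biblio.record_entry (bre) is the Evergreen table mentioned earlier in the log.

```sql
-- :bottom_id is the lowest bib ID still unexported; each worker takes
-- its own OFFSET, and the batch size (LIMIT) grows when a batch
-- finishes faster than the previous one.
SELECT id, marc
  FROM biblio.record_entry
 WHERE NOT deleted
   AND id >= :bottom_id
 ORDER BY id
 LIMIT 100 OFFSET :worker_offset;
```

Starting each round from a recalculated bottom ID keeps the OFFSETs small, which matters because PostgreSQL still has to scan and discard all the rows an OFFSET skips.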
14:08 <Dyrcona> Yeah. I don't use threads in Perl 5. I'd do something like that in Python or C++, though.
14:09 <Dyrcona> You synchronize writes to one file with a mutex lock. I just think threads would overcomplicate this program.
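In forked Perl workers, the usual stand-in for that mutex is an exclusive flock on the shared output file. A minimal, runnable sketch (the filename and helper name are illustrative):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);

# Serialize appends to one output file across processes with an
# exclusive advisory lock -- the "mutex" for forked workers.
sub locked_write {
    my ($path, $data) = @_;
    open(my $fh, '>>', $path) or die "open $path: $!";
    flock($fh, LOCK_EX)       or die "flock $path: $!";
    seek($fh, 0, SEEK_END);   # another writer may have appended meanwhile
    print {$fh} $data;
    close($fh);               # closing releases the lock
}

locked_write('out.mrc', "record one\n");
locked_write('out.mrc', "record two\n");
```

Because the lock is advisory, it only works if every writer uses the same discipline, which is part of why a single-writer design like Dyrcona's is simpler to reason about.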
14:09 <Bmagic> cool, sounds like you got it handled
14:10 <Bmagic> here's my think FWIW https://github.com/mcoia/mobius_evergreen/tree/master/bib_extract
14:10 <Bmagic> thing*
14:11 <Dyrcona> :)
14:11 <Dyrcona> Seems like we're constantly reinventing marc_export.pl. :)
14:11 <Bmagic> yeah, lol, in my defense, I started this in 2013
14:19 <Dyrcona> Well, I think this is going to be faster. I'll just assume that the entries in my two array_aggs line up.
14:41 <Dyrcona> Yeah. It's a lot faster, now.
14:48 *** jvwoolf joined #evergreen
14:49 <Bmagic> Dyrcona++
14:49 <Bmagic> excellent
14:50 <Bmagic> open question: has anyone had issues with the new Bootstrap OPAC (integrated into the staff client) not automatically filling in the patron barcode on the hold placement page when placing holds for patrons, specifically copy-level holds?
14:51 <Bmagic> I couldn't find a bug report... but we all know how launchpad can be when it comes to search....
15:03 <Dyrcona> I seem to recall there being bugs about copy holds and the new catalog/staff client, but not sure of the specifics.
15:21 *** jvwoolf joined #evergreen
15:51 *** jvwoolf left #evergreen
16:01 *** jihpringle joined #evergreen
17:21 *** mmorgan left #evergreen
18:01 <pinesol> News from qatests: Testing Success <http://testing.evergreen-ils.org/~live>
19:34 *** jihpringle joined #evergreen