DEV-667: stage item in repo & index with catalog & full-text #1
Almost working - just need to get it to index the output from solr (probably an issue with load_into_solr.sh)
Get it up to date with new imgsrv & slip stuff
|
I think it does not make sense to containerize |
stage-item/Gemfile
Outdated
gem "marc", "~> 1.2"
gem "faraday", "~> 2.7"
gem "faraday-follow_redirects"
gem "ht-pairtree", git: "../../ht-pairtree"
needs to be updated to point to https://github.com/hathitrust/ht-pairtree/ after hathitrust/ht-pairtree#1 is merged
|
This should stay in draft until the following PRs are merged: hathitrust/ht-pairtree#1 |
|
@moseshll I think for you the task (when this is all ready) will be to try to pull it down & run through the instructions (which are not up to date yet) |
|
Regarding DEV-663 and pageturner: There was some support here for pageturner, which this stomps on. For DEV-663 I think it will make sense to re-add the apache support here, using the stuff done for imgsrv where possible, and then clean up some of the docker & apache stuff in imgsrv? Out of scope for now, but should be addressed in DEV-663: the links in the catalog for items don't go to :8888 where the babel apps are running. |
Force-pushed from f32bb53 to f3442b2
Force-pushed from f3442b2 to 5b3d9d9
trying to get it all working w/ a clean checkout
|
Building the imgsrv apache from here isn't working; the most expedient thing might just be to move that part in here now (since I was thinking of doing that anyway) |
This attempts to reconcile earlier work here with later work in the imgsrv repo and more recent work in this branch. It uses:
- nginx for catalog, imgsrv fastcgi, and static files
- proxy to apache for cgi

So far working:
- catalog incl. CSS & JS (via shared checked-out common repo)
- imgsrv fcgi
- imgsrv cgi

I was also able to clone pt & see that it at least attempted it (it had an error about missing GeoIP data)
* Add ssd to checkout list
* Enumerate directories to mount for apache - otherwise directories we have in the image (geoip, cache, etc) get masked. Could change this in the future if we move more of the infrastructure directly to this repo rather than relying on checkouts in the parent dir
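To illustrate the "enumerate directories to mount" point: bind-mounting a whole parent directory hides anything the image already has under that path, while mounting only named subdirectories leaves the rest of the image visible. The sketch below is not the configuration used in this branch. The function name `mount_args` and the example paths are hypothetical.

```shell
# Sketch only: mount_args and all paths passed to it are hypothetical.
# Print one --volume flag and one source:destination pair per named
# subdirectory, so that only those subdirectories are bind-mounted and
# directories baked into the image (geoip, cache, ...) are not masked.
mount_args() {
  src_root="$1"
  dest_root="$2"
  shift 2
  for name in "$@"; do
    printf '%s\n%s/%s:%s/%s\n' '--volume' "$src_root" "$name" "$dest_root" "$name"
  done
}
```

For example, `mount_args "$PWD/.." /checkouts pt ssd` would emit arguments for just the pt and ssd checkouts. The destination `/checkouts` is a placeholder, not the path the apache container actually uses.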
* ensure usage is actually printed
* update ht-pairtree to make sure that namespace dir is created with correct prefix
|
At this point, the following is working:
Still to-do:
Next steps:
|
|
Note that we should not merge this (even if working) until these PRs are merged: hathitrust/ht-pairtree#1 |
|
I also see a complaint from pageturner that it can't read the |
|
Solved the pt metadata issue - I think a combination of not using the branch that uses |
* Don't give instructions to clutter parent dir
* Move dockerfile for perl apps here
Force-pushed from 9b5b88d to 98d9d81
|
As of right now the catalog links seem to magically be using http and not https... Not sure what's going on |
- ensures data dir is owned by current user
- mount log dir outside

We could try to use a Docker volume for this, but the problem is that it still wouldn't be owned by the solr user by default. If we were using a Dockerfile instead of mounting config directories in, we would have some other options. There might be other ways to work around this in the future, but this works for now.
Avoids issues with permissions w/ cache, logs, etc
the web server now runs as the user running setup.sh, so cache needs to be writable by that user - it was being created & owned by root
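The ownership fixes in the last few commits come down to one check: a directory that will be bind-mounted has to exist, and be owned by the invoking user, before any container starts. Otherwise Docker creates it as root. A minimal sketch of that check follows. The function name `ensure_owned_dir` is hypothetical and this is not the code in setup.sh.

```shell
# Sketch only: ensure_owned_dir is a hypothetical helper, not taken from setup.sh.
# Create the directory as the invoking user if it is missing, then confirm that
# user owns it. Returns non-zero if the directory already exists but belongs to
# someone else, e.g. root, because Docker created it for an earlier bind mount.
ensure_owned_dir() {
  dir="$1"
  mkdir -p "$dir" || return 1
  if [ ! -O "$dir" ]; then
    echo "warning: $dir is not owned by $(id -un); fix ownership before starting containers" >&2
    return 1
  fi
}
```

A setup script could call this for each data, log, and cache directory, then start the containers with the same identity, for example via `docker run --user "$(id -u):$(id -g)"` or the Compose `user:` key. Whether this branch uses either mechanism is an assumption.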
* Fix mount for slip output
|
@moseshll @respinos I was able to run through the instructions in the README without additional fiddling, and I believe everything now works both over ssh and https, and without any permissions weirdness. I think we still need to get these PRs merged: hathitrust/lss_solr_configs#5 Once those are merged I will remove the branches from |
This seems to be some issue with my browser - if I try in a new private/incognito window it's fine. Must be some sort of pinning of localhost to https 😩
|
Still unresolved; should track; probably don't need to solve right this minute (but should soon, and will need to for mb, etc; I'll make a note in that issue)
Also @moseshll @carylwyatt when you have a chance it would be good to make sure this all works on ARM without emulation/rigmarole - I think it should but I don't know for sure. |
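One rough way to check the "without emulation" part on an ARM machine is to compare the host architecture with the architecture recorded in each built image. The helper names below are hypothetical. The only assumption about naming is that `uname -m` reports `aarch64` or `arm64` on ARM hosts while Docker labels such images `arm64`.

```shell
# Sketch only: normalize_arch and runs_natively are hypothetical helper names.
# Map `uname -m` style names onto the names Docker uses for image architectures.
normalize_arch() {
  case "$1" in
    x86_64|amd64) echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "$1" ;;
  esac
}

# Succeeds when host and image architectures match, i.e. no emulation is needed.
runs_natively() {
  [ "$(normalize_arch "$1")" = "$(normalize_arch "$2")" ]
}
```

Usage would look like `runs_natively "$(uname -m)" "$(docker image inspect --format '{{.Architecture}}' IMAGE)"`, where `IMAGE` stands for whichever image the compose setup built.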
- build indexer rather than using image (which may not exist)
- take pt & ssd off of branches
|
Now just waiting on hathitrust/lss_solr_configs#5 |
|
Superseded by #5
A couple questions that are coming up as I work through this:
To what extent should individual things have their own docker-compose files, to what extent should they include dependencies and what should the default practices for port binding be? I think my inclination for the babel apps is to keep it fairly limited - enough to run any tests that might be present, but not replicating the full stack of dependencies for each app, and instead to keep that here.
Where should the apache configuration go? My inclination would be to keep it here rather than in imgsrv, since it will serve for the other CGI applications (or should)
Should we rename https://github.com/hathitrust/imgsrv-sample-data ? Does it make sense to keep it separate, or should we merge it with this repository?
Things are somewhat inconsistent in terms of their reference to mysql-sdr vs mariadb. For consistency with current production and to minimize pain I think it makes sense to standardize on the existing mysql-sdr (and likewise for solr-sdr-catalog, perhaps?) but I could be convinced otherwise.