@RayBB
Collaborator

@RayBB RayBB commented Aug 2, 2025

This code was always hidden and not displayed anyway.

We use the location field for redirects.
But it used to be that people could set a location (like Miami). That ended long ago. I think this is a relic from that era.

There are currently 2993 authors with a location field, per this DuckDB query:

-- column1 is the record key and column4 is the record JSON in the dump
COPY (
    SELECT
        column1,
        column4 ->> 'location' AS location
    FROM read_csv('ol_dump_authors_2025-07-31.txt.gz')
    WHERE column4 ->> 'location' IS NOT NULL
    LIMIT 1000000
)
TO 'authors_with_locations.csv';

Here they are:
authors_with_locations.csv
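
For anyone without DuckDB handy, here is a rough Python sketch of the same extraction. It assumes the dump is tab-separated with five columns (type, key, revision, last modified, JSON) and that the JSON never contains a raw tab; `authors_with_location` is a made-up helper name, not something in the codebase.

```python
import gzip
import json


def authors_with_location(dump_path):
    """Yield (author key, location) for each dump row whose JSON has a non-null location."""
    with gzip.open(dump_path, "rt", encoding="utf-8") as fh:
        for line in fh:
            # Assumed layout: type, key, revision, last_modified, JSON.
            parts = line.rstrip("\n").split("\t", 4)
            if len(parts) < 5:
                continue  # skip rows that do not match the assumed layout
            record = json.loads(parts[4])
            location = record.get("location")
            if location is not None:
                yield parts[1], location
```

Calling `list(authors_with_location("ol_dump_authors_2025-07-31.txt.gz"))` should give roughly the same rows as the CSV above, if those assumptions hold.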

Interestingly only 785 include "/author/" in their location which is odd because I've probably merged that many authors alone.
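
One thing worth checking when counting these: the substring "/author/" does not occur inside a key of the form "/authors/OL123A", so a substring search for "/author/" would not count keys in that form. A small sketch with made-up sample values, assuming a well-formed author key looks like `/authors/OL<digits>A`:

```python
import re

# Assumption: a well-formed author key looks like /authors/OL<digits>A.
AUTHOR_KEY = re.compile(r"/authors/OL[0-9]+A")


def summarize_locations(locations):
    """Count values containing "/author/" and values that are well-formed author keys."""
    contains_author = sum(1 for loc in locations if "/author/" in loc)
    author_keys = sum(1 for loc in locations if AUTHOR_KEY.fullmatch(loc))
    return {"contains_/author/": contains_author, "author_key": author_keys}


# Hypothetical sample values, not taken from the dump.
sample = ["Miami", "/authors/OL123A", "/author/OL456A"]
print(summarize_locations(sample))  # prints {'contains_/author/': 1, 'author_key': 1}
```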


@hornc
Collaborator

hornc commented Aug 6, 2025

@RayBB I'm not really aware of how the location field on authors is used, and I agree it doesn't seem to be visible in the UI, but it is documented in the OL authors schema: https://github.com/internetarchive/openlibrary-client/blob/1ab2acc487877392ee9d7eb7124d21e387a5bafb/olclient/schemata/author.schema.json#L30-L35

and on the /type/ page here: https://openlibrary.org/type/author

So if it is being removed, those should be updated too.

@RayBB
Collaborator Author

RayBB commented Aug 6, 2025

@hornc once upon a time, location was a field where people would put the geographic place where an author lives. For example: https://openlibrary.org/authors/OL487729A/Fernando_Telletxea?_compare=Compare&b=9&a=8&m=diff

I've moved many of those to the description (and deleted others like above). They weren't visible in any way.

However, we still use the location field for redirects like this:
https://openlibrary.org/authors/OL9956482A.json
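
Roughly, the distinction looks like this. This is only a sketch, assuming a redirect record has the type key "/type/redirect" and keeps its target in "location"; `redirect_target` and the sample records are hypothetical, not the actual Open Library code.

```python
from typing import Optional


def redirect_target(record: dict) -> Optional[str]:
    """Return the redirect target if the record is a redirect, else None."""
    type_ = record.get("type")
    # Assumption: the type is either {"key": "/type/..."} or a plain string.
    type_key = type_.get("key") if isinstance(type_, dict) else type_
    if type_key == "/type/redirect":
        return record.get("location")
    return None  # a location on a non-redirect record is not a redirect target


# Hypothetical records for illustration only.
redirect = {"key": "/authors/OL2A", "type": {"key": "/type/redirect"}, "location": "/authors/OL1A"}
author = {"key": "/authors/OL1A", "type": {"key": "/type/author"}, "location": "Miami"}
print(redirect_target(redirect))  # prints /authors/OL1A
print(redirect_target(author))    # prints None
```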

You're right that we should update the schema and that page as well.
I have created a PR for the schema: internetarchive/openlibrary-client#429

Do you know if that /type/author page is used in any way? Is it just documentation, or is it somehow used by infogami or OL? If it's not used, I suspect we should probably remove it or just redirect it to the schema you linked above.

@tfmorris
Contributor

tfmorris commented Aug 7, 2025

Interestingly only 785 include "/author/" in their location

Those 785 records are broken and won't redirect as you can see by looking at the first one on the list: https://openlibrary.org/authors/OL1274282A

It looks like the Wikidata bot corrupted that particular record, but I haven't checked any of the others:

https://openlibrary.org/authors/OL1274282A/noname?_compare=Comparer&b=5&a=4&m=diff

which is odd because I've probably merged that many authors alone.

There are almost half a million author records which have been merged (and many tens of thousands more which still need merging):

gzcat ol_dump_redirects_2025-06-30.txt.gz | grep -E 'OL[0-9]+A' | wc -l
  479582
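
A rough Python equivalent of that count, for anyone whose grep does not accept the same pattern syntax. Like `grep | wc -l`, it counts lines containing at least one author-style ID rather than the number of IDs; `count_author_lines` is a made-up name.

```python
import gzip
import re

# Author-style identifier: "OL", one or more digits, then "A".
AUTHOR_ID = re.compile(r"OL[0-9]+A")


def count_author_lines(dump_path):
    """Count lines in a gzipped dump that contain at least one author-style ID."""
    with gzip.open(dump_path, "rt", encoding="utf-8") as fh:
        return sum(1 for line in fh if AUTHOR_ID.search(line))
```

Running `count_author_lines("ol_dump_redirects_2025-06-30.txt.gz")` should be comparable to the shell pipeline above.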

@RayBB
Collaborator Author

RayBB commented Aug 7, 2025

@tfmorris thanks for digging in. Drini knows about the bot corrupting some records and plans to fix it next week

@tfmorris
Contributor

tfmorris commented Aug 7, 2025

Drini knows about the bot corrupting some records and plans to fix it next week

If you link that issue here, it'll save others the trouble of looking it up themselves.

@RayBB
Collaborator Author

RayBB commented Aug 7, 2025

@cdrini
Collaborator

cdrini commented Aug 11, 2025

This is the bug: #11099. Otherwise LGTM, thank you @RayBB!

@cdrini cdrini merged commit f5c0ab2 into master Aug 11, 2025
8 checks passed
@cdrini cdrini deleted the remove-unused-author-location branch August 11, 2025 19:57
@tfmorris
Contributor

@cdrini thank you for understanding the request
