Migration indexes #3536

nichwall · 2024-10-19T18:47:30Z

This PR fixes #3259, #3525, and #3237.

This PR adds migrations for the following indices:

BookAuthor on authorId
BookSeries on seriesId
PodcastEpisode on createdAt and podcastId (from "Add podcastId index to podcastEpisodes", #3528)

The author and series indexes reduce query time on large databases (more than 20k items) from multiple seconds, or even minutes, to less than a second. I have not done much testing with large podcast databases yet. I am still investigating why some of the select book queries did not improve much and whether another index could fix that.
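To illustrate why an index on authorId helps, here is a minimal sketch using Python's built-in sqlite3 module. It is not the migration code from this PR: the simplified `bookAuthors` table and the index name `bookAuthor_authorId` are assumptions made for the example. It compares the query plan SQLite picks for the same lookup before and after the index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified join table; the real schema has more columns.
conn.execute(
    "CREATE TABLE bookAuthors (id TEXT PRIMARY KEY, bookId TEXT, authorId TEXT)"
)


def query_plan(connection, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


lookup = "SELECT bookId FROM bookAuthors WHERE authorId = ?"

# Without an index on authorId, SQLite has to scan the whole table.
plan_before = query_plan(conn, lookup, ("author-1",))

# Index name is an assumption for this sketch.
conn.execute("CREATE INDEX bookAuthor_authorId ON bookAuthors (authorId)")

# With the index, SQLite can search it instead of scanning.
plan_after = query_plan(conn, lookup, ("author-1",))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should report a scan of the table and the second a search using the new index.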

To test the difference in query time, I did the following on a moderately sized database so the loop ran in a reasonable amount of time.
Database stats:

Authors: 3217
Series: 947
Books: 5892

Enabled benchmark logging for each SQL query
Generated a HAR file of navigating around through the web client to get a variety of SQL requests
Used a combination of Python/bash scripts to:
- Delete and copy database from a backup to start at the same point for all tests
- Start the server
- Run all requests from HAR file
- Stop the server
- Repeat above steps 10 times
- Copy the log file and rename according to index so we can keep all queries for this specific test separate
- Parse the log files to build a table comparing worst time of each query for each data set
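The last step, building a table of the worst time per query for each data set, could look roughly like the sketch below. The scripts actually used are not part of this PR, so everything here is an assumption: the log line format (`[benchmark] <elapsed>ms <SQL>`), the function names, and the sample numbers, which are made up purely to exercise the code.

```python
import re
from collections import defaultdict

# Hypothetical log line format: "[benchmark] <elapsed>ms <SQL text>"
LINE_RE = re.compile(r"^\[benchmark\] (\d+(?:\.\d+)?)ms (.+)$")


def worst_times(log_lines):
    """Map each distinct SQL string to its slowest observed time in ms."""
    worst = defaultdict(float)
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that are not benchmark entries
        elapsed_ms, sql = float(match.group(1)), match.group(2)
        worst[sql] = max(worst[sql], elapsed_ms)
    return dict(worst)


def comparison_table(logs_by_dataset):
    """Return (sql, {dataset: worst_ms}) rows, slowest first by the first dataset."""
    datasets = list(logs_by_dataset)
    per_dataset = {name: worst_times(lines) for name, lines in logs_by_dataset.items()}
    queries = set().union(*(times.keys() for times in per_dataset.values()))
    baseline = per_dataset[datasets[0]]
    rows = [
        (sql, {name: per_dataset[name].get(sql) for name in datasets})
        for sql in queries
    ]
    rows.sort(key=lambda row: baseline.get(row[0], 0.0), reverse=True)
    return rows


# Made-up sample data, not measurements from this PR.
logs = {
    "no_indexes": [
        "[benchmark] 2100ms SELECT a",
        "[benchmark] 1900ms SELECT a",
        "[benchmark] 40ms SELECT b",
    ],
    "with_indexes": [
        "[benchmark] 35ms SELECT a",
        "[benchmark] 42ms SELECT b",
    ],
}
table = comparison_table(logs)
for sql, times in table:
    print(sql, times)
```

Taking the worst time rather than the mean keeps the comparison focused on the slowest run of each query across the 10 repetitions.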

I sorted the times by the runtime without indexes and created the following table (it does not include every combination of added indexes, only enough to show the best and worst cases):

nichwall · 2024-10-19T19:36:38Z

After some more attempts, I have been unable to find additional indices to speed up the selects on each book. I think that is just related to how many columns are being loaded. I looked at adding indices for feeds and adding the titleIgnorePrefix index back in, but both of those made all queries slower. I'm not sure what other indices to try adding for these long queries. The longest query, at around 600-700 ms, is below:

SELECT `book`.`id`, `book`.`title`, `book`.`titleIgnorePrefix`, `book`.`subtitle`, `book`.`publishedYear`, `book`.`publishedDate`, `book`.`publisher`, `book`.`description`, `book`.`isbn`, `book`.`asin`, `book`.`language`, `book`.`explicit`, `book`.`abridged`, `book`.`coverPath`, `book`.`duration`, `book`.`narrators`, `book`.`audioFiles`, `book`.`ebookFile`, `book`.`chapters`, `book`.`tags`, `book`.`genres`, `book`.`createdAt`, `book`.`updatedAt`, `libraryItem`.`id` AS `libraryItem.id`, `libraryItem`.`ino` AS `libraryItem.ino`, `libraryItem`.`path` AS `libraryItem.path`, `libraryItem`.`relPath` AS `libraryItem.relPath`, `libraryItem`.`mediaId` AS `libraryItem.mediaId`, `libraryItem`.`mediaType` AS `libraryItem.mediaType`, `libraryItem`.`isFile` AS `libraryItem.isFile`, `libraryItem`.`isMissing` AS `libraryItem.isMissing`, `libraryItem`.`isInvalid` AS `libraryItem.isInvalid`, `libraryItem`.`mtime` AS `libraryItem.mtime`, `libraryItem`.`ctime` AS `libraryItem.ctime`, `libraryItem`.`birthtime` AS `libraryItem.birthtime`, `libraryItem`.`size` AS `libraryItem.size`, `libraryItem`.`lastScan` AS `libraryItem.lastScan`, `libraryItem`.`lastScanVersion` AS `libraryItem.lastScanVersion`, `libraryItem`.`libraryFiles` AS `libraryItem.libraryFiles`, `libraryItem`.`extraData` AS `libraryItem.extraData`, `libraryItem`.`createdAt` AS `libraryItem.createdAt`, `libraryItem`.`updatedAt` AS `libraryItem.updatedAt`, `libraryItem`.`libraryId` AS `libraryItem.libraryId`, `libraryItem`.`libraryFolderId` AS `libraryItem.libraryFolderId`, `libraryItem->feeds`.`id` AS `libraryItem.feeds.id`, `libraryItem->feeds`.`slug` AS `libraryItem.feeds.slug`, `libraryItem->feeds`.`entityType` AS `libraryItem.feeds.entityType`, `libraryItem->feeds`.`entityId` AS `libraryItem.feeds.entityId`, `libraryItem->feeds`.`entityUpdatedAt` AS `libraryItem.feeds.entityUpdatedAt`, `libraryItem->feeds`.`serverAddress` AS `libraryItem.feeds.serverAddress`, `libraryItem->feeds`.`feedURL` AS `libraryItem.feeds.feedURL`, 
`libraryItem->feeds`.`imageURL` AS `libraryItem.feeds.imageURL`, `libraryItem->feeds`.`siteURL` AS `libraryItem.feeds.siteURL`, `libraryItem->feeds`.`title` AS `libraryItem.feeds.title`, `libraryItem->feeds`.`description` AS `libraryItem.feeds.description`, `libraryItem->feeds`.`author` AS `libraryItem.feeds.author`, `libraryItem->feeds`.`podcastType` AS `libraryItem.feeds.podcastType`, `libraryItem->feeds`.`language` AS `libraryItem.feeds.language`, `libraryItem->feeds`.`ownerName` AS `libraryItem.feeds.ownerName`, `libraryItem->feeds`.`ownerEmail` AS `libraryItem.feeds.ownerEmail`, `libraryItem->feeds`.`explicit` AS `libraryItem.feeds.explicit`, `libraryItem->feeds`.`preventIndexing` AS `libraryItem.feeds.preventIndexing`, `libraryItem->feeds`.`coverPath` AS `libraryItem.feeds.coverPath`, `libraryItem->feeds`.`createdAt` AS `libraryItem.feeds.createdAt`, `libraryItem->feeds`.`updatedAt` AS `libraryItem.feeds.updatedAt`, `libraryItem->feeds`.`userId` AS `libraryItem.feeds.userId` FROM `books` AS `book` INNER JOIN `libraryItems` AS `libraryItem` ON `book`.`id` = `libraryItem`.`mediaId` AND (`libraryItem`.`libraryId` = 'a210cdb5-cb8d-4ff1-bd87-34eaefffd218' AND `libraryItem`.`mediaType` = 'book') LEFT OUTER JOIN `feeds` AS `libraryItem->feeds` ON `libraryItem`.`id` = `libraryItem->feeds`.`entityId` AND `libraryItem->feeds`.`entityType` = 'libraryItem' ORDER BY titleIgnorePrefix COLLATE NOCASE ASC LIMIT 630, 35;

advplyr · 2024-10-19T20:51:10Z

I think that when we can start improving the API the queries will be simpler and it will be easier to write indexes for them. That data is really helpful, thanks for pulling that.
The only update I had to make here: because I had already created the indexes manually while testing, the migration was crashing, so I added a check for existing indexes first. This is working well for me.

Thanks!
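The check for existing indexes described above can be sketched as follows. This is not the code merged in this PR; it is a minimal illustration in Python with the built-in sqlite3 module, and the helper names, the simplified `bookSeries` table, and the index name are all assumptions.

```python
import sqlite3


def index_exists(conn, index_name):
    """Return True if SQLite already has an index with this name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?",
        (index_name,),
    ).fetchone()
    return row is not None


def add_index_if_missing(conn, index_name, table, columns):
    """Create the index unless it already exists. Returns True if it was created."""
    if index_exists(conn, index_name):
        return False
    # Identifiers are interpolated directly, so only pass trusted, hard-coded names.
    column_list = ", ".join(f'"{column}"' for column in columns)
    conn.execute(f'CREATE INDEX "{index_name}" ON "{table}" ({column_list})')
    return True


conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table for the example.
conn.execute(
    "CREATE TABLE bookSeries (id TEXT PRIMARY KEY, bookId TEXT, seriesId TEXT)"
)

# Running the same step twice must not fail: the second call is a no-op.
first = add_index_if_missing(conn, "bookSeries_seriesId", "bookSeries", ["seriesId"])
second = add_index_if_missing(conn, "bookSeries_seriesId", "bookSeries", ["seriesId"])
print(first, second)
```

In raw SQL, SQLite's `CREATE INDEX IF NOT EXISTS` achieves the same effect in a single statement; the explicit check is shown here because it mirrors the "check for them first" approach.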

nichwall added 4 commits October 19, 2024 10:40

Add: migrations for authors, series, and podcast episodes

1fa80e3

Update changelog

ea6882d

Fix: table naming

e8a1ea3

Fix: podcast episode index name

84012d9

nichwall marked this pull request as ready for review October 19, 2024 19:36

Update index creation migration to be idempotent

35e2681

advplyr merged commit 72e59e7 into advplyr:master Oct 19, 2024
5 checks passed

advplyr mentioned this pull request Oct 19, 2024

Add podcastId index to podcastEpisodes #3528

Closed

nichwall deleted the migration_indexes branch October 19, 2024 21:05
