I am making ePub files from out-of-copyright scanned books on Archive.org.
The scanned text files have some problems:
1. Words are often split by stray spaces. For example, "impossible" appears as "im possible".
2. They have no table of contents.
So I would like to create some kind of program or regex that automatically rejoins words that have been split and automatically creates a table of contents.
How might I do this?
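
To make the two goals concrete, here is a minimal Python sketch of one possible approach, not a definitive solution. It rests on assumptions that are not stated above: that a word list is available (the tiny `KNOWN_WORDS` set is a hypothetical stand-in for a real dictionary file), and that chapters begin with a line starting "CHAPTER" or "Chapter" followed by a Roman or Arabic numeral. The function names `rejoin_split_words` and `build_toc` are made up for illustration.

```python
import re

# Hypothetical stand-in for a real dictionary; load a full word list in practice.
KNOWN_WORDS = {"impossible", "it", "was", "to", "see"}

def _key(token):
    """Lowercase a token and drop non-letters so it can be looked up."""
    return re.sub(r"[^a-z]", "", token.lower())

def rejoin_split_words(line, known_words=KNOWN_WORDS):
    """Join two adjacent tokens when together they form a known word
    and at least one of them is not a known word by itself.
    Note: runs of whitespace are collapsed to single spaces."""
    tokens = line.split()
    out = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            a, b = tokens[i], tokens[i + 1]
            both_known = _key(a) in known_words and _key(b) in known_words
            # Only join if the first fragment ends in a letter (no punctuation between).
            if a[-1].isalpha() and _key(a + b) in known_words and not both_known:
                out.append(a + b)
                i += 2
                continue
        out.append(tokens[i])
        i += 1
    return " ".join(out)

# Assumed heading style: a line that starts with "CHAPTER"/"Chapter" and a numeral.
CHAPTER_RE = re.compile(
    r"^[ \t]*((?:CHAPTER|Chapter)[ \t]+(?:[IVXLC]+|\d+)\b.*)$",
    re.MULTILINE,
)

def build_toc(text):
    """Return a list of (entry number, heading text) for each chapter heading found."""
    return [
        (n, match.group(1).strip())
        for n, match in enumerate(CHAPTER_RE.finditer(text), start=1)
    ]

if __name__ == "__main__":
    print(rejoin_split_words("It was im possible to see."))
    sample = "CHAPTER I. The Start\nSome text\n\nCHAPTER II\nMore text\n"
    for number, heading in build_toc(sample):
        print(number, heading)
```

A plain regex cannot tell "im possible" from two legitimate words, which is why the sketch checks candidates against a word list. It will still miss words split into three or more pieces and may wrongly join rare pairs, so the output would need proofreading.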