Commit 0a03305

Merge pull request #19 from Szwendacz99/md-html-img-tags-updating
also try updating img tags if url is same as bookstack
2 parents c5cdc64 + 4a47b1e commit 0a03305

2 files changed: 42 additions & 55 deletions

README.md

Lines changed: 32 additions & 50 deletions
@@ -39,70 +39,52 @@ python exporter.py \
 
 Customization:
 ```text
-usage: exporter.py [-h] [-p PATH] [-t TOKEN_FILE] [-H HOST]
-                   [-f {markdown,plaintext,pdf,html,zip} [{markdown,plaintext,pdf,html,zip} ...]]
-                   [--rate-limit RATE_LIMIT] [-c FORBIDDEN_CHARS [FORBIDDEN_CHARS ...]]
-                   [-u USER_AGENT]
-                   [--additional-headers ADDITIONAL_HEADERS [ADDITIONAL_HEADERS ...]]
-                   [-l {pages,chapters,books} [{pages,chapters,books} ...]]
-                   [--force-update-files] [--images] [--markdown-images]
-                   [--images-dir IMAGES_DIR] [--skip-broken-image-links]
-                   [--dont-export-attachments] [--dont-export-external-attachments]
-                   [-V {debug,info,warning,error}]
-
-BookStack exporter
-
 options:
   -h, --help            show this help message and exit
-  -p PATH, --path PATH  Path where exported files will be placed.
-  -t TOKEN_FILE, --token-file TOKEN_FILE
+  -p, --path PATH       Path where exported files will be placed.
+  -t, --token-file TOKEN_FILE
                         File containing authorization token in format TOKEN_ID:TOKEN_SECRET
-  -H HOST, --host HOST  Your domain with protocol prefix, example: https://example.com
-  -f {markdown,plaintext,pdf,html,zip} [{markdown,plaintext,pdf,html,zip} ...], --formats {markdown,plaintext,pdf,html,zip} [{markdown,plaintext,pdf,html,zip} ...]
+  -H, --host HOST       Your domain with protocol prefix, example: https://example.com
+  -f, --formats {markdown,plaintext,pdf,html,zip} [{markdown,plaintext,pdf,html,zip} ...]
                         Space separated list of formats to use for export.
   --rate-limit RATE_LIMIT
-                        How many api requests can be made in a minute. Default is 180
-                        (BookStack defaults)
-  -c FORBIDDEN_CHARS [FORBIDDEN_CHARS ...], --forbidden-chars FORBIDDEN_CHARS [FORBIDDEN_CHARS ...]
+                        How many api requests can be made in a minute. Default is 180 (BookStack
+                        defaults)
+  -c, --forbidden-chars FORBIDDEN_CHARS [FORBIDDEN_CHARS ...]
                         Space separated list of symbols to be replaced with "_" in filenames.
-  -u USER_AGENT, --user-agent USER_AGENT
-                        User agent header content. In situations where requests are blocked
-                        because of bad client/unrecognized web browser/etc (like with
-                        CloudFlare tunnels), change to some typical web browser user agent
-                        header.
+  -u, --user-agent USER_AGENT
+                        User agent header content. In situations where requests are blocked because of
+                        bad client/unrecognized web browser/etc (like with CloudFlare tunnels), change to
+                        some typical web browser user agent header.
   --additional-headers ADDITIONAL_HEADERS [ADDITIONAL_HEADERS ...]
-                        List of arbitrary additional HTTP headers to be sent with every HTTP
-                        request. They can override default ones, including Authorization
-                        header. IMPORTANT: these headers are also sent when downloading
-                        external attachments! Don't put here any private data.Example: -u
-                        "Header1: value1" "Header2: value2"
-  -l {pages,chapters,books} [{pages,chapters,books} ...], --level {pages,chapters,books} [{pages,chapters,books} ...]
+                        List of arbitrary additional HTTP headers to be sent with every HTTP request.
+                        They can override default ones, including Authorization header. IMPORTANT: these
+                        headers are also sent when downloading external attachments! Don't put here any
+                        private data.Example: -u "Header1: value1" "Header2: value2"
+  -l, --level {pages,chapters,books} [{pages,chapters,books} ...]
                         Space separated list of levels at which should be export performed.
-  --force-update-files  Set this option to skip checking local files timestamps against remote
-                        last edit timestamps. This will cause overwriting local files, even if
-                        they seem to be already in newest version.
-  --images              Download images and place them in dedicated directory in export path
-                        root, preserving their internal paths
-  --markdown-images     The same as --images, but will also update image links in exported
-                        markdown files (if they are bein exported). Warning: this is
-                        experimental, as API does not provide a way to know what images are
-                        actually on the page. Therefore for markdown data all ']({URL}'
-                        occurences will be replaced with local, relative path to images, and
-                        additionally any '/scaled-\d+-/' regex match will be replaced with '/'
-                        so that scaled images are also displayed
+  --force-update-files  Set this option to skip checking local files timestamps against remote last edit
+                        timestamps. This will cause overwriting local files, even if they seem to be
+                        already in newest version.
+  --images              Download images and place them in dedicated directory in export path root,
+                        preserving their internal paths
+  --markdown-images     The same as --images, but will also update image links in exported markdown files
+                        (if they are bein exported). Warning: this is experimental, as API does not
+                        provide a way to know what images are actually on the page. Therefore for
+                        markdown data all ']({URL}' and '<img src={URL}' occurences will be replaced with
+                        local, relative path to images, and additionally any '/scaled-\d+-/' regex match
+                        will be replaced with '/' so that scaled images are also displayed
   --images-dir IMAGES_DIR
-                        When exporting images, they will be organized in directory located at
-                        the same path as exported document. This parameter defines name of
-                        this directory.
+                        When exporting images, they will be organized in directory located at the same
+                        path as exported document. This parameter defines name of this directory.
   --skip-broken-image-links
-                        Don't fail and skip downloading images if their url obtained from
-                        images gallery API seem broken (image cannot be downloaded OR fails to
-                        download).
+                        Don't fail and skip downloading images if their url obtained from images gallery
+                        API seem broken (image cannot be downloaded OR fails to download).
   --dont-export-attachments
                         Set this to prevent exporting any attachments.
   --dont-export-external-attachments
                         Set this to prevent exporting external attachments (from links).
-  -V {debug,info,warning,error}, --log-level {debug,info,warning,error}
+  -V, --log-level {debug,info,warning,error}
                         Set verbosity level.
 ```
 
exporter.py

Lines changed: 10 additions & 5 deletions
@@ -122,7 +122,7 @@
                     "exported markdown files (if they are bein exported)."
                     " Warning: this is experimental, as API does not provide a way to "
                     "know what images are actually on the page. Therefore for markdown data"
-                    " all ']({URL}' occurences will be replaced with local, relative "
+                    " all ']({URL}' and '<img src={URL}' occurences will be replaced with local, relative "
                     "path to images, and additionally any '/scaled-\\d+-/' regex match"
                     " will be replaced with '/' so that scaled images are also displayed")
 parser.add_argument('--images-dir',
@@ -456,12 +456,17 @@ def update_markdown_image_tags(doc: Node, data: bytes) -> bytes:
     levels = doc.parents_levels()
     # "](" is a part of markdown image tag, used here to
     # try preventing replacing host url in other paces
-    dir_fallback = ']('
-    dir_fallback += '../' * levels
+    dir_fallback_md = ']('
+    dir_fallback_md += '../' * levels
+    dir_fallback_md += args.images_dir
+    dir_fallback_html = '<img src="'
+    dir_fallback_html += '../' * levels
+    dir_fallback_html += args.images_dir
 
     host = removesuffix(args.host, '/')
-    dir_fallback += args.images_dir
-    data = data.replace(f']({host}'.encode(), dir_fallback.encode())
+    data = data.replace(f']({host}'.encode(), dir_fallback_md.encode())
+    data = data.replace(f'<img src="{host}'.encode(), dir_fallback_html.encode())
+    print("md images")
     data_str = re.sub(r'/scaled-\d+-/', r'/', data.decode())
     return data_str.encode()
 
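The replacement logic in the `update_markdown_image_tags` hunk can be tried in isolation. What follows is a minimal sketch rather than the project's code: the function name `rewrite_image_links`, the host `https://wiki.example.com`, and the directory name `images` are assumptions for illustration. The `levels` argument stands in for `doc.parents_levels()`, and `images_dir` for `args.images_dir`.

```python
import re


def rewrite_image_links(data: bytes, host: str, images_dir: str, levels: int) -> bytes:
    """Point image URLs served by `host` at a local, relative images directory.

    Hypothetical stand-alone version of the logic in the hunk above.
    """
    # Drop a single trailing slash so "https://host/" and "https://host" behave alike.
    if host.endswith('/'):
        host = host[:-1]
    # Relative path from the exported document up to the export root.
    local_prefix = '../' * levels + images_dir
    # Markdown image syntax: ![alt](URL)
    data = data.replace(f']({host}'.encode(), f']({local_prefix}'.encode())
    # Raw HTML image tags embedded in the markdown: <img src="URL">
    data = data.replace(f'<img src="{host}'.encode(), f'<img src="{local_prefix}'.encode())
    # Scaled variants live under /scaled-<width>-/; strip that path segment.
    text = re.sub(r'/scaled-\d+-/', '/', data.decode())
    return text.encode()


sample = (
    b'![a](https://wiki.example.com/uploads/images/gallery/2024-01/scaled-1680-/a.png)\n'
    b'<img src="https://wiki.example.com/uploads/images/gallery/2024-01/b.png">\n'
)
print(rewrite_image_links(sample, 'https://wiki.example.com/', 'images', 2).decode())
```

Because both replacements are plain substring matches, an `<img>` tag whose `src` uses single quotes, or has another attribute before `src`, would be left unchanged by this sketch.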