Split config from the PHP script

master
Y 2016-01-29 20:34:08 +01:00
parent 6073c36115
commit 1f47656be8
4 changed files with 52 additions and 39 deletions

.gitignore

@@ -0,0 +1 @@
web/paperweb.ini.php

README

@@ -1,4 +1,4 @@
This program come in two parts:
This program comes in two parts:
* the shell search, that can be used on its own,
* the web interface, that needs the former.
@@ -16,7 +16,7 @@ There are several aspects to consider:
* Data from the web frontend must not be trusted. This is handled by "the program" (more on this later).
* The web server as whole must not be granted access to Paperwork's documents.
Web server security is outside the scope of this document. Read Apache or Nginx documentation.
Web server security is outside the scope of this document. Read Apache or Nginx documentation to set up some sort of authentication (password, client certificate…).
Now the web frontend. It needs access to the documents. A naive solution would be to use POSIX ACLs on the documents, so that the web server user is allowed to read them. However, this would allow *any* program running on the web server to read the documents, not just my program. And other programs on the web server might not be constrained by the same authentication rules. For the same reason, I ruled out allowing the web server user to execute random find/grep/awk/etc. commands, let alone a shell command like bash.
@@ -28,11 +28,11 @@ Finally, all the input (search terms, etc.) is verified by the CLI script, so th
= INSTALLATION AND DEPENDENCIES =
The script may be copied anywhere. Its dependencies are standard, although I only tested on Debian: bash, find, gawk, sed, grep, file, base64, ls. Besides, if pdfinfo and pdftoppm are installed, the script detects these and handles the pages from PDF documents in an individual manner (as JPEG images), instead of returning the whole document when a single page is requested.
The shell script may be copied anywhere. Its dependencies are standard, although I only tested on Debian and Arch Linux: bash, find, gawk, sed, grep, file, base64, ls. Besides, if pdfinfo and pdftoppm are installed, the script detects these and handles the pages from PDF documents in an individual manner (as JPEG images), instead of returning the whole document when a single page is requested.
Once the script is installed, the path to the Paperwork data must be changed (edit the script). Optionally, the number of DPI (dots per inch) for the extraction of pages from PDF documents may be adjusted as well.
Same goes for the PHP file's installation. It depends on PHP 5.2 or better (so that JSON is included).
Once the PHP file is installed, the path and user for the CLI script must be set (edit the file).
Same goes for the PHP file's installation (```index.php```). It depends on PHP 5.2 or better (so that JSON is included).
Alongside the PHP file, there must be a configuration file named ```paperweb.ini.php```, which should be created by copying and changing the provided example file: ```example-paperweb.ini.php```.
The CSS file is optional, and it can be adapted.
Finally, ```visudo``` must be run so that the web server user is allowed to execute the CLI script, for example:
@@ -53,18 +53,8 @@ The program comes with its own documentation (use ```-h```), but here are some e
$ paperfind.sh -Q -d '201512|2016' -i -l 'sosh|yves' | json_pp
[
{
"labels" : [
"facture / taxe",
"Sosh",
"téléphonie",
"Yves"
],
"count" : 11,
"etag" : "1453224546.0000000000",
"folder" : "20160119_1828_14",
"type" : "pdf"
},
{
"count" : 3,
"labels" : [
"facture / taxe",
"Sosh",
@@ -72,6 +62,18 @@ $ paperfind.sh -Q -d '201512|2016' -i -l 'sosh|yves' | json_pp
"Yves"
],
"type" : "pdf",
"count" : 11
},
{
"count" : 3,
"type" : "pdf",
"etag" : "1452541168.0000000000",
"labels" : [
"facture / taxe",
"Sosh",
"téléphonie",
"Yves"
],
"folder" : "20151217_0000_01"
}
]
@@ -79,20 +81,23 @@ $ paperfind.sh -Q -d '201512|2016' -i -l 'sosh|yves' | json_pp
$ paperfind.sh -T 20151217_0000_01 | json_pp
[
{
"etag" : "1452541166.0000000000",
"mime" : "image/jpeg",
"height" : 212,
"data" : "…base64-encoded data…",
"width" : 149
},
{
"etag" : "1452541167.0000000000",
"mime" : "image/jpeg",
"data" : "...base64-encoded data...",
"width" : 149,
"data" : "…base64-encoded data…",
"height" : 212
},
{
"etag" : "1452541167.0000000000",
"mime" : "image/jpeg",
"data" : "...base64-encoded data...",
"width" : 149,
"height" : 212
},
{
"mime" : "image/jpeg",
"data" : "...base64-encoded data...",
"data" : "…base64-encoded data…",
"width" : 149,
"height" : 212
}
@@ -100,10 +105,11 @@ $ paperfind.sh -T 20151217_0000_01 | json_pp
$ paperfind.sh -D 20151217_0000_01 -p 3 | json_pp
{
"width" : 0,
"mime" : "application/pdf",
"height" : 0,
"data" : "...base64-encoded data..."
"etag" : "1452541108.0000000000",
"width" : 745,
"mime" : "image/jpeg",
"data" : "…base64-encoded data…",
"height" : 1053
}
```
@@ -111,4 +117,3 @@ $ paperfind.sh -D 20151217_0000_01 -p 3 | json_pp
There is not much to say, apart from the fact that the web server should probably be configured to authenticate users before granting access to the interface. It depends on the value of the data managed by Paperwork.
The current web UI is minimal and very much a prototype. I may improve it, depending on my own needs, or if someone volunteers to contribute ;-)


@@ -0,0 +1,12 @@
;<?php exit(); ?>
; The above is to prevent direct access by the HTTP client.
;
; Here are the settings to change:
; PATH is the path to the `paperfind.sh` command.
; USER is the system user who will run this command.
;
; This is an example file. Copy it to “paperweb.ini.php” in the
; same directory as “index.php”, and change the settings there.
PATH = /usr/local/bin/paperfind.sh
USER = pwdataread
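
A note on how this file is consumed: the PHP script reads it with `parse_ini_file()`, which treats lines starting with `;` as comments. The first line is therefore invisible to the INI parser, while still making PHP exit at once if the file is requested directly over HTTP. The commit uses the returned array as-is. As a minimal sketch, assuming one wants a clear error when the file is missing or incomplete, a loader could look like the following. `load_paperweb_config()` is a hypothetical helper, not part of this commit.

```php
<?php
// Hypothetical helper (not in the commit): load and validate the settings.
function load_paperweb_config($file)
{
    // parse_ini_file() returns false when the file is missing or malformed;
    // the warning is silenced because the failure is reported below.
    $conf = @parse_ini_file($file);
    if ($conf === false) {
        throw new RuntimeException("Cannot read configuration file: {$file}");
    }
    // Both settings are needed to build the sudo command line.
    foreach (array('PATH', 'USER') as $key) {
        if (!isset($conf[$key]) || $conf[$key] === '') {
            throw new RuntimeException("Missing setting '{$key}' in {$file}");
        }
    }
    return $conf;
}
```

In the PHP script this would stand in for the bare `parse_ini_file('paperweb.ini.php')` call, for example `$CONF = load_paperweb_config(dirname(__FILE__) . '/paperweb.ini.php');`. Resolving the path relative to the script rather than the current working directory is a design choice of this sketch, not something the commit does.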


@@ -12,12 +12,7 @@
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
########## CONFIGURATION ##########
$PATH='/PATH/TO/paperfind.sh';
$USER='USER THAT SUDO WILL RUN paperfind.sh AS';
##### NO CHANGE PAST THIS LINE #####
$CONF = parse_ini_file('paperweb.ini.php');
if (array_key_exists('doDownload', $_REQUEST)) {
$date = @$_REQUEST['date'];
@@ -26,7 +21,7 @@ if (array_key_exists('doDownload', $_REQUEST)) {
$pagearg = escapeshellarg($page);
# -M and -R are used instead of -D to avoid storing the data in RAM
$json = exec("sudo -u {$USER} {$PATH} -M {$datearg} -p {$pagearg}");
$json = exec("sudo -u {$CONF['USER']} {$CONF['PATH']} -M {$datearg} -p {$pagearg}");
if ($json) {
$meta = json_decode($json, true);
$knownetag = trim(@$_SERVER['HTTP_IF_NONE_MATCH'], ' "');
@@ -37,7 +32,7 @@ if (array_key_exists('doDownload', $_REQUEST)) {
header("Content-Type: {$meta['mime']}");
header("ETag: \"{$meta['etag']}\"");
header("Content-Disposition: inline; filename=\"{$date}_{$page}.{$ext}\"");
passthru("sudo -u {$USER} {$PATH} -R {$date} -p {$page}");
passthru("sudo -u {$CONF['USER']} {$CONF['PATH']} -R {$datearg} -p {$pagearg}");
}
}
} else {
@@ -69,7 +64,7 @@ if (array_key_exists('doDownload', $_REQUEST)) {
$labels = escapeshellarg(implode('|', $matches[0]));
preg_match_all($pattern, $_REQUEST['K'], $matches);
$words = escapeshellarg(implode('|', $matches[0]));
$json = exec("sudo -u {$USER} {$PATH} -Q -d {$dates} -l {$labels} -k {$words} -i");
$json = exec("sudo -u {$CONF['USER']} {$CONF['PATH']} -Q -d {$dates} -l {$labels} -k {$words} -i");
unset($_REQUEST['doThumbnails']);
unset($_REQUEST['thumbnailsDone']);
unset($_REQUEST['currentDoc']);
@@ -115,7 +110,7 @@ if (array_key_exists('doDownload', $_REQUEST)) {
if (array_key_exists('doThumbnails', $_REQUEST)) {
$date = $_REQUEST['doThumbnails'];
$datearg = escapeshellarg($date);
$json = exec("sudo -u {$USER} {$PATH} -T {$datearg}");
$json = exec("sudo -u {$CONF['USER']} {$CONF['PATH']} -T {$datearg}");
} else {
$json = @$_REQUEST['thumbnailsDone'];
$date = @$_REQUEST['currentDoc'];