Spaces: giacomov/pdffigures2

giacomov committed on Apr 26, 2023

Commit 689a76f

1 parent: 1005fed

Upload 3 files

Files changed (3)

Dockerfile ADDED

+# read the doc: https://huggingface.co/docs/hub/spaces-sdks-docker
+# you will also find guides on how best to write your Dockerfile
+FROM condaforge/mambaforge:23.1.0-1
+RUN mamba install -y sbt=1.7.1 git gradio
+WORKDIR /work
+COPY data/pdffigures2.jar /work
+COPY app.py /work
+# exec form so python runs as PID 1 and receives stop signals directly
+ENTRYPOINT ["python", "app.py"]
+# sbt "runMain org.allenai.pdffigures2.FigureExtractorBatchCli 2304.11968v1.Track_Anything_Segment_Anything_Meets_Videos.pdf -m figures -t 48 -q"

README.md CHANGED

@@ -6,6 +6,7 @@ colorTo: indigo
 sdk: docker
 pinned: false
 license: apache-2.0
+app_port: 7860
 ---
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

app.py ADDED

+import gradio as gr
+import urllib.request
+import subprocess
+import os
+import glob
+def extract_figure(url):
+    # download PDF file from URL
+    urllib.request.urlretrieve(url, "input.pdf")
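+    # extract figures from the PDF using pdffigures2 (images are written with the "figures_" prefix)
+    # check=True raises CalledProcessError if the java process exits with an error
+    subprocess.run(["java", "-jar", "pdffigures2.jar", "input.pdf", "-m", "figures_"], check=True)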
+    all_pngs = glob.glob("*.png")
+    print(all_pngs)
+    # get path to first figure
+    figure_path = "figures_input-Figure1-1.png"
+    # # read first figure from file
+    # with open(figure_path, "rb") as f:
+    #     figure_bytes = f.read()
+    # # delete downloaded file and figure file
+    # os.remove("input.pdf")
+    # os.remove(figure_path)
+    # return first figure
+    return figure_path
+# define input and output components (the gr.inputs / gr.outputs namespaces are deprecated)
+inputs = gr.Textbox(label="Enter URL of PDF file:")
+outputs = gr.Image(label="First figure in PDF:", type="filepath")
+# create interface
+interface = gr.Interface(
+    fn=extract_figure,
+    inputs=inputs,
+    outputs=outputs,
+    title="Extract First Figure from PDF",
+    description="Enter the URL of a PDF file and the first figure in the file will be extracted and displayed.",
+)
+# launch interface; bind to all interfaces on the app_port from README.md so the Docker port mapping can reach it
+interface.launch(server_name="0.0.0.0", server_port=7860)
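
app.py collects `all_pngs` with `glob` but then returns a hard-coded `figures_input-Figure1-1.png`, which will not exist when the PDF has no figure detected as Figure 1. A minimal sketch of choosing the first figure from the glob results instead, assuming output files follow the `<prefix><pdf name>-Figure<N>-<k>.png` pattern seen in that hard-coded path (the helper name `pick_first_figure` is hypothetical and not part of the commit):

```python
import re

# Assumed filename pattern, inferred from "figures_input-Figure1-1.png" in app.py.
_FIGURE_RE = re.compile(r"-Figure(\d+)-(\d+)\.png$")


def pick_first_figure(paths):
    """Return the path with the lowest (figure number, index), or None if no path matches."""
    numbered = []
    for path in paths:
        match = _FIGURE_RE.search(path)
        if match:
            # Compare numerically so that Figure2 sorts before Figure10.
            numbered.append((int(match.group(1)), int(match.group(2)), path))
    if not numbered:
        return None
    return min(numbered)[2]
```

Inside `extract_figure`, this could be called as `pick_first_figure(glob.glob("figures_input-*.png"))`, with a `None` result handled by showing an error to the user rather than returning a missing path.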