Skip to main content

Using OpenAI to summarise PDF

To use OpenAI to summarise text from a PDF using Python 3.11.6, you'll first need to extract the text from the PDF and then send it to the OpenAI API for summarisation.

 

Preparation

 

Set-up

pip install python-dotenv langchain openai tiktoken pypdf pymupdf

 

Code

The current code is on my Summaries GitHub page.

 

How to add an environment variable in Ubuntu

To set an environment variable on Ubuntu, can be achieved via a few options.  This depends on whether you want the variable to be system-wide or specific to a user's session.  Here are a couple of more common methods for setting environment variables:

Adding SSL wildcard certificate to Ubuntu running Nginx

Adding an SSL wildcard certificate to an Ubuntu server involves several steps.  A wildcard certificate can secure subdomains of a domain with a single certificate. Here's a general outline of the process:

I'll be using an existing wildcard certificate.

sudo apt update && sudo apt upgrade -y

 

Solving the errors in running Open AI on Ubuntu

While the default version on Ubuntu 20.04 for Python is 3.8, I've added Python 3.11.5 (latest version).

Errors

GPTSimpleVectorIndex is deprecated

Attempting to run python3.11 model-ai.py and I'm seeing the following response

Update Python on Ubuntu

Ubuntu 20.04 comes with Python 3.8 installed.  If you run the update script, you'll be informed that the latest version of Python is running.  But here is the kicker, the actual latest version is currently 3.11.6... see https://www.python.org/downloads/source/

Use the following commands to download the Python 3.11.6 source code

How do you clear caches on Ubuntu?

At first, I attempted

echo 1 > /proc/sys/vm/drop_caches

Response

-bash: /proc/sys/vm/drop_caches: Permission denied

Adding sudo in front of the command was met with the same result.  What about if I execute the shell as root

sudo sh -c 'echo 1 > /proc/sys/vm/drop_caches'

Success.  Caches cleared.

 

flask_debugtoolbar module doesn't exist

Error when running ckan.ini init

from flask_debugtoolbar import DebugToolbarExtension
ModuleNotFoundError: No module named 'flask_debugtoolbar'

Activate your CKAN virtual environment, for example:

. /usr/lib/ckan/default/bin/activate

Then check your location is

cd /usr/lib/ckan/default/src

 

Subscribe to 20.04