Accessing Excel Files on Sharepoint With Python

Accessing Excel files on sharepoint with Python

I’ve been diving deep into Python lately. Mostly, I’m trying to learn as much as a can but more importantly, I’ve got a few things I’d really like to accomplish. For instance, I have a client I help with their marketing, web development, and advertising. As part of this I setup a lead tracker in Excel on an office 365 sharepoint page.

In 2020 I just used Excel to do some basic calcuations but this year I wanted to add some more functionality and work a little with Python.

Now, most of you are probably saying, ah no problem, just point your Python script to pull the Excel document from your local files of OneDrive, except, I run Linux. No OneDrive for me.

So I set about to find an API. Truthfully this was actually pretty easy.

To install the sharepy package just do: pip install sharepy

Do some imports:

import os
import sharepy

Logging in is pretty easy….

s = sharepy.connect("TENANT.sharepoint.com")

Running the above will ask for a username and password.'

Now for the headache, and the reason I’m writing this.

Copying any of the links from Sharepoint seems to keep downloading an html file (the web version of the document) instead of the actual document. I need the document to work with pandas.

After some playing around, I found that sharing via People with excisting access gave me a link like this

https://TENENT.sharepoint.com/:x:/r/sites/Marketing/Shared%20Documents/filename.xlsx?d=<bunchOfNumbers>=1&web=1&e=36z69b

As best I understand “d” is an identifier used to locate the file incase you move it later, so the link still works. I removed all of the parameters from the url and was left with a dirtect download link.

https://TENENT.sharepoint.com/:x:/r/sites/Marketing/Shared%20Documents/filename.xlsx

Now I can drop the url into another cell of a jupyter notebook and run that to retrieve the file I need.

r = s.getfile("https://TENANT.sharepoint.com/:x:/r/sites/Marketing/Shared%20Documents/filename.xlsx", filename="raw/leads.xlsx")

Running this should give you a reponse code of 200, meaning everything went well.

In this case, I saved the file in another folder (raw) and under a different name.

Here is all the code:

import os
import sharepy

s = sharepy.connect("TENANT.sharepoint.com")

r = s.getfile("https://TENANT.sharepoint.com/:x:/r/sites/Marketing/Shared%20Documents/MYFILE.xlsx", filename="raw/leads.xlsx")
r