The Microsoft 365 (M365, the Microsoft cloud bundle: email, Word, Excel, Teams, file storage) bill came in at £680 a month last quarter, up from £420 the year before. Most of the increase was storage. The IT lead pulled the usage report and found that SharePoint (the file-and-document side of Microsoft 365) had ballooned to 4.2 terabytes, about 30% of which was a single sub-site called “Old finance archive” that hadn’t been touched since 2022. Nobody was confident about deleting any of it. The contracts inside might still matter, the board minutes definitely did, and the 19,000 PDFs of supplier invoices from 2018 to 2020 probably didn’t, but who’s making that call?
We get versions of this conversation a lot. SharePoint grows and storage costs grow with it. The question of what to archive, what to delete, and what to keep gets pushed down the list every quarter because nobody wants to be the person who deletes the wrong thing. Meanwhile the bill keeps climbing and the search experience gets slower because there’s too much noise in the index.
The four-step plan below is what we use to take that decision off the floor and put it into a structure. It works for tenancies (the Microsoft 365 setup that holds all your users, files and settings) of any size, and it doesn’t require deleting anything until everyone’s comfortable.
Why this matters
Three reasons it’s worth doing properly, in order of how often they bite.
The first is cost. Microsoft 365 storage isn’t free past the included allowance. The included SharePoint pool is a 1 TB base plus 10 GB per licensed user, so a 25-seat tenancy lands around 1.25 TB; once you’re past that, additional storage runs at roughly £0.18 per GB per month. A 25-seat tenancy that’s grown to 4 TB out of laziness is about 2.75 TB over its allowance, which works out at roughly £500 a month it doesn’t need to be paying.
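A minimal sketch of that arithmetic, for anyone who wants to plug in their own numbers. It assumes 1 TB is counted as 1,000 GB and uses the rough £0.18 rate quoted above; the function names are ours, and the actual allowance and price depend on your licence, so check your own agreement.

```python
BASE_POOL_GB = 1_000       # 1 TB base allowance, counted as 1,000 GB (assumption)
PER_USER_GB = 10           # extra allowance per licensed user
OVERAGE_RATE_GBP = 0.18    # rough price per extra GB per month, from the text


def included_pool_gb(licensed_users: int) -> int:
    """Pooled SharePoint allowance for the whole tenancy, in GB."""
    return BASE_POOL_GB + PER_USER_GB * licensed_users


def monthly_overage_gbp(used_gb: float, licensed_users: int) -> float:
    """Monthly cost of storage above the included pool (never negative)."""
    extra_gb = max(0.0, used_gb - included_pool_gb(licensed_users))
    return round(extra_gb * OVERAGE_RATE_GBP, 2)
```

Calling included_pool_gb(25) gives the allowance for a 25-seat tenancy, and monthly_overage_gbp(4_000, 25) gives the monthly overage for the same tenancy holding 4 TB.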
The second is search and findability. SharePoint’s search index works better when there’s less in it. When a user types “current expenses policy” and gets six different versions from 2017, 2019, 2021, and 2024, the answer they need is buried. Archiving the old versions properly makes the live document findable.
The third is compliance. UK retention requirements vary by document type: seven years for HMRC records, six years for contracts, two years for some HR documents, indefinite for some regulatory records. A SharePoint estate that hasn’t been retention-tagged is one that can neither prove it kept the right things long enough nor prove it disposed of the right things on time. Both fail the same audit.
Common failure modes
The patterns we see when SharePoint archiving hasn’t been done deliberately:
- One sub-site per decision. Every reorganisation spun up a new top-level area. The old ones sit there with nobody owning them.
- Finance dictating the layout. The site structure mirrors the chart of accounts because that’s what the head of finance pictured when SharePoint was first set up. The operations team can’t find anything because they don’t think in cost-centre codes (we wrote about this stakeholder mismatch in an earlier post).
- Personal drives full of company data. Half the contracts live in someone’s OneDrive (their personal file space in Microsoft 365) because they were drafted there and never moved. When that person leaves, the contracts leave with them unless someone catches it first.
- Versioning chaos. “Contract_FINAL_v2_revised_jb.docx” sits alongside seventeen other near-identical files. The actual current version is the one nobody can find.
- No retention labels. Everything’s treated as keep-forever because nobody set the policies. Storage grows linearly with time.
The four-step plan is designed to tackle these.
The four-step plan
Step 1: Inventory and classify
What this means in practice: list every SharePoint site and document library, note who owns it, how active it is, and what kind of content sits inside.
You can’t archive what you can’t see, so start by pulling a usage report from the M365 admin centre. It lists every site with its size, its last-modified date, and how many users have touched it in the last 90 days. Anything bigger than 50 GB that hasn’t been touched in 12 months is a candidate for archiving. So is anything with no listed owner. For each one, name the owner today, even if it’s an educated guess, and book a 15-minute conversation with them about what’s inside.
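The two candidate rules can be sketched as a small filter over the usage report. This is an illustration only: the SiteUsage fields are hypothetical names, and how you map them from the exported report is up to you. "12 months" is approximated as 365 days, and a site with no recorded activity at all is treated as stale.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

SIZE_THRESHOLD_GB = 50
STALE_AFTER = timedelta(days=365)   # "12 months", approximated as 365 days


@dataclass
class SiteUsage:
    # Hypothetical field names; map them from your own usage report export.
    url: str
    owner: Optional[str]             # None or "" when no owner is listed
    storage_gb: float
    last_activity: Optional[date]    # None when the report shows no activity


def candidate_reasons(site: SiteUsage, today: date) -> list:
    """Return the reasons (possibly none) a site is an archive candidate."""
    reasons = []
    stale = site.last_activity is None or today - site.last_activity > STALE_AFTER
    if site.storage_gb > SIZE_THRESHOLD_GB and stale:
        reasons.append("large and inactive")
    if not (site.owner or "").strip():
        reasons.append("no listed owner")
    return reasons


def archive_candidates(sites, today: date) -> dict:
    """Map each candidate site's URL to the reasons it was flagged."""
    flagged = {}
    for site in sites:
        reasons = candidate_reasons(site, today)
        if reasons:
            flagged[site.url] = reasons
    return flagged
```

The output is a starting list for the 15-minute owner conversations, not a decision about what to move or delete.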
The classification matters too. There are roughly four buckets: live (in active use), reference (rarely touched but needs to be findable), archive (legally required to keep but not searched), and disposable (no business or legal reason to retain). Every site, every library, slots into one of those.
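One way to keep the four buckets consistent across owner conversations is to ask the same three questions in the same order. The sketch below is an assumption about that order (active use first, then findability, then legal retention), not a rule from Microsoft or from any regulation.

```python
from enum import Enum


class Bucket(Enum):
    LIVE = "live"              # in active use
    REFERENCE = "reference"    # rarely touched but needs to be findable
    ARCHIVE = "archive"        # legally required to keep but not searched
    DISPOSABLE = "disposable"  # no business or legal reason to retain


def classify(in_active_use: bool, must_be_findable: bool, must_be_retained: bool) -> Bucket:
    """Pick one bucket for a site or library from three yes/no answers."""
    if in_active_use:
        return Bucket.LIVE
    if must_be_findable:
        return Bucket.REFERENCE
    if must_be_retained:
        return Bucket.ARCHIVE
    return Bucket.DISPOSABLE
```

For example, a library nobody opens, that nobody needs to search, but that the law says to keep lands in Bucket.ARCHIVE.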
Step 2: Set retention labels
What this means in practice: tag each content type with how long it needs to be kept and what happens at the end of that period.
Retention labels are M365’s way of doing this properly, but most tenancies we look at have them either not configured or configured once in 2021 and never revisited. The labels you need are simpler than they look. Five or six is enough for most SMEs: HMRC-7-years, Contracts-6-years, HR-2-years, Board-indefinite, General-3-years, Disposable-immediate. Tag the document libraries, not individual files. Files inherit the library’s label by default, and the few exceptions can be re-tagged manually.
This is where the finance-led layout we mentioned causes problems. If the site structure follows the chart of accounts, the retention labels have to follow the document type, and the two don’t line up. Sometimes the right answer is a partial restructure before labels go on. Sometimes it’s easier to live with the friction and over-tag at the library level. Either approach is defensible; the only wrong answer is no labels at all.
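Before configuring anything in the tenancy, it can help to write the label set down as plain data and check what disposal dates it implies. The sketch below is a planning aid under stated assumptions, not M365 configuration: the retention period is counted from a start date you supply (creation date, contract end, or whatever event your policy uses), and 29 February falls back to 28 February in non-leap years.

```python
from datetime import date
from typing import Optional

# Label name -> retention in years (None = keep indefinitely, 0 = no retention).
RETENTION_YEARS = {
    "HMRC-7-years": 7,
    "Contracts-6-years": 6,
    "HR-2-years": 2,
    "Board-indefinite": None,
    "General-3-years": 3,
    "Disposable-immediate": 0,
}


def add_years(start: date, years: int) -> date:
    """Add whole years to a date, handling 29 February."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February.
        return start.replace(year=start.year + years, day=28)


def disposal_date(label: str, retention_start: date) -> Optional[date]:
    """Earliest date content under this label may be disposed of, or None."""
    years = RETENTION_YEARS[label]
    if years is None:
        return None
    return add_years(retention_start, years)
```

An unknown label raises a KeyError on purpose, so a typo in a library’s label shows up immediately rather than silently defaulting.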
Step 3: Move the disposable, archive the reference
What this means in practice: physically separate the live content from the reference content, and clean out the disposable content with a clear audit trail.
Don’t delete on day one; move first. Create an archive site (or a dedicated archive library inside each major site, depending on volume) and move anything that’s classified reference or archive into it. The live sites get smaller, faster, and easier to navigate. The archive content is still searchable for users who need it, but it’s out of the daily flow.
Disposable content gets a 60-day grace period: move it to a hold area, notify the owner, give them the chance to flag anything that shouldn’t go. After 60 days with no flags, delete with a logged record of what went and when. That record matters because if someone asks in two years where a 2020 supplier invoice went, you can show that it was reviewed, tagged disposable, given a grace period, and removed.
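The grace-period rule and the logged record can be sketched in a few lines. The HeldItem fields and the log entry keys are hypothetical; the point is that nothing is deletable until 60 days have passed with no flag, and that every deletion leaves a record of what went and when.

```python
from dataclasses import dataclass
from datetime import date, timedelta

GRACE_PERIOD = timedelta(days=60)


@dataclass
class HeldItem:
    # Hypothetical record for one item sitting in the hold area.
    path: str
    owner: str
    moved_to_hold: date
    flagged: bool = False    # True if the owner asked for it to be kept


def due_for_deletion(items, today: date) -> list:
    """Items that have sat unflagged in the hold area for the full grace period."""
    return [
        item for item in items
        if not item.flagged and today - item.moved_to_hold >= GRACE_PERIOD
    ]


def deletion_log_entry(item: HeldItem, deleted_on: date) -> dict:
    """Audit record to store before the item is actually removed."""
    return {
        "path": item.path,
        "owner_notified": item.owner,
        "classified": "disposable",
        "moved_to_hold": item.moved_to_hold.isoformat(),
        "deleted_on": deleted_on.isoformat(),
    }
```

Writing the log entry before the delete, rather than after, means a failed or interrupted cleanup never leaves a gap in the trail.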
Step 4: Set up the rhythm
What this means in practice: schedule the same review on a recurring basis so the estate doesn’t drift back.
Archiving isn’t a one-off project; it’s a rhythm. Quarterly review of new sites that have appeared. Annual review of which sites have shifted from live to reference and need to be moved. Six-monthly check that the retention labels are still right; if regulation changes, the labels follow. The rhythm is what stops the same conversation happening in three years’ time.
For most managed clients we hold this rhythm on their behalf. It’s about 90 minutes of work a quarter once the initial cleanup is done.
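If you want the three review cadences in a calendar rather than in someone’s head, a sketch like the one below generates the dates. The review names are our own shorthand, and the start date is whatever day the initial cleanup finished.

```python
import calendar
from datetime import date

# Review name -> interval in months, following the cadences described above.
REVIEW_INTERVALS = {
    "new-sites review": 3,
    "retention-label check": 6,
    "live-to-reference review": 12,
}


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of shorter months."""
    index = start.year * 12 + (start.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def reviews_due(start: date, until: date) -> list:
    """All (date, review name) pairs after start and up to until, in date order."""
    due = []
    for name, months in REVIEW_INTERVALS.items():
        n = 1
        when = add_months(start, months * n)
        while when <= until:
            due.append((when, name))
            n += 1
            when = add_months(start, months * n)
    return sorted(due)
```

Over any twelve-month window this yields seven reviews: four quarterly, two six-monthly, and one annual.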
Where SMEs trip
Two big ones recur.
The first is starting with deletion instead of starting with classification. Somebody decides to “have a clear-out”, picks a site that looks old, and deletes it. Three months later, an HR claim surfaces and the records from that era are needed but gone. The discipline is to classify first, then move, then decide what to delete, not the other way round.
The second is letting the structure ossify. SharePoint’s good at letting you build a structure that worked five years ago and is still in place today, even though the business has changed. The owner of the “Old projects” sub-site is the person who joined in 2019 and left in 2023. New people have built parallel structures because the old one didn’t reflect how they worked. The fix is to fold structural review into the quarterly rhythm: not a big-bang reorganisation, but small, ongoing adjustments.
What good looks like
When this is working, the SharePoint estate is roughly half the size it would otherwise be. Search returns the right document on the first hit. Users know where to put new content because the structure reflects current work. Retention labels are applied at the library level and inherited by default. Storage costs are flat or declining. The compliance answer to “show us your retention policy” is a screenshot of the labels and the dates.
The audit trail is the boring win, but it’s the one that matters most when something gets contested.
Where this lands with us
SharePoint archiving sits inside our Managed Services practice. For managed clients we run the inventory, write the retention labels, drive the cleanup, and hold the quarterly rhythm; it’s part of the M365 administration we do anyway. For clients on a self-managed footing we’ll do the initial inventory and retention design and hand it over with a scoping sheet.
Either way, the cost of doing nothing keeps rising. The bill goes up every quarter. The audit you can’t pass costs you a customer contract. The HR claim you can’t defend because the records went missing costs more. And the contracts walking out the door inside someone’s OneDrive don’t show up until the person’s already gone, so tidy beats the alternative every time.
SharePoint usage report looking heavier than it should? Drop us a note at info@jmopartners.co.uk and we’ll run an initial scoping pass.
Want the printable version of this checklist? Drop us a note at info@jmopartners.co.uk and we’ll send it through.
JMO|Partners · Enterprise IT, sized for SMEs.