AI in legal practice: what I tried, what broke, and what I built

In short: After a year of testing frontier AI models and legal-specific platforms including Legora and Spellbook, the central problem was not capability but structure. A law firm cannot be built around a model. It needs a method in which AI operates under controlled, auditable workflows, with the lawyer as the final gate. Professional responsibility stays with the lawyer.
I have always been drawn to what new technology can do. I learned to code on a ZX Spectrum at school. I got one of the first mobile phones. I worked in-house through the dot-com boom. I have been blogging about technology regulation since 2010. So when frontier models became genuinely capable of useful legal work, I wanted to see how far AI in legal practice could go.
I spent a year finding out.
The fascination, and what I tried
I started where many lawyers start. I used the most capable general-purpose models I could get my hands on. I tested a range of legal-specific platforms across research, drafting and document review (Legora and Spellbook among them), because the real question was not which tool looked best in a demo, but whether any of them could support a legal method that was supervisable, repeatable and defensible. I also have subscriptions to Lexis and vLex, both of which include AI capabilities: Lexis AI and vLex’s Vincent respectively. I tried those too. In each case the AI interface proved difficult to use within a governed, auditable workflow. The underlying platform content, specifically precedent documents and primary legal research, remained valuable. I use both platforms for that content directly, rather than through their AI wrappers.
I also read the growing literature on the “AI-native law firm”, the idea that you could build a practice directly on top of these tools and let the model carry most of the work.
The early results were striking. Fast answers. Plausible analysis. Draft clauses in seconds. For the first time I could see, in concrete terms, how the economics and rhythm of a regulatory practice might change.
I wanted the “AI-native” thesis to be right.
Where it broke
It took a few months of serious use to see the problem, and the problem was not capability. The models were capable. The legal platforms were capable. What was missing was structure.
Chat history is not an audit trail. A prompt is not a process. When the same question produced materially different answers depending on how it was phrased, or when a model confidently cited a case that did not exist, I could not satisfy myself that the work was repeatable or supervisable. I could not hand it to another lawyer and have them reconstruct what had been done. I could not show a client, a regulator, or my insurers a clean chain of reasoning from instruction to output.
That matters because the professional rules do not change when AI is in the loop. The SRA Principles and the Code of Conduct for Solicitors still apply. The solicitor remains responsible for the service provided, for maintaining competence (Paragraph 3.3), and for keeping client information confidential (Paragraph 6.3). The Law Society’s guidance on generative AI and the SRA’s guidance on AI in the legal market both treat AI use as a supervised capability, not a substitute for judgement.
The “AI-native” thesis is appealing. It is also, as I came to see it, the wrong end of the telescope. If the solicitor carries the professional responsibility, the solicitor has to carry the method too. A model cannot be the place where a firm’s method resides, because that method has to be visible, reviewable and reproducible without the model being present.
Recognising the constraints
Once I stopped asking how capable the model was and started asking how AI fits into a controlled legal method, the constraints fell into place.
Outputs must be usable in client work with minimal rework. Judgement must be visible and controlled. Processes must be auditable and reconstructable. Work must be repeatable across similar matters. The system must be modular, with no dependency on a single model or provider. It must run with low friction in live matters.
And it must continue to function when the AI layer is unavailable. This was not a hypothetical concern when I was designing the system: models are updated, providers change their terms, and no commercial service comes with a guarantee. Every file in the system is in a standard, human-readable format: Word documents, PDFs, plain text workflow specifications, and an open-standard relational database. If the AI layer stopped working tomorrow, or gave me a result I was not satisfied with, I could open each file directly and complete the work as I always did before; the underlying work product would be intact. The method predates the AI assistance. It will outlast it.
Data protection is part of the constraint set, not an afterthought. Processing client personal data through third-party model APIs engages Article 28 and Article 32 of the UK GDPR on processor terms and security, and Chapter V on international transfers. The ICO’s guidance on AI and data protection is clear that accountability under Article 5(2) of the UK GDPR sits with the controller, whatever the tooling. The choice of tooling is a data protection decision before it is a productivity decision.
Designing my system
The design I arrived at has four layers, each with a defined role.
Layer one: workflow
The first layer is the workflow. The core recurring tasks are each a named workflow: new matter intake, document review, regulatory research, advice memo, data protection impact assessment, article drafting. Each workflow is a specification that tells the model what the task is, what inputs it takes, which primary sources to consult, how the analysis is structured, which quality control checks must run before the output is released, and what the output looks like. Quality control is part of the workflow itself, with checks for legal accuracy, source verification, factual verification, style compliance, and presentation and judgement run on every task. The workflows are versioned in plain text. They encode my method, and they get refined whenever a matter surfaces a gap.
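To make this concrete, here is a hypothetical sketch of what such a plain-text workflow specification might look like. The field names and checks below are illustrative only, not the actual specifications used in the practice; they simply mirror the elements described above (task, inputs, sources, structure, quality control, output):

```
workflow: advice-memo
version: 2.3
inputs:
  - client and matter context (read from the data store)
  - instructing email or attendance note
sources:
  - primary legislation and regulator guidance for the matter area
structure:
  - issue summary
  - analysis against primary sources
  - recommendation and next steps
quality_control:
  - legal accuracy review
  - source verification (every citation checked against the primary source)
  - factual verification
  - style compliance
  - presentation and judgement review
output: Word document, saved to the matter file structure
```

Because the specification is plain text, it can be versioned, diffed, and refined whenever a matter surfaces a gap, without any AI tooling present.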
Layer two: data store
The second layer is the structured data store. Clients, matters, contacts, engagement model, fee basis, active workstreams and conflict markers live in a relational database. Workflows read client and matter context from the database at the start of a task, so the model is working against live state rather than whatever I typed into a prompt. Material changes to the data are logged in an audit trail. The database is a standard open-format SQLite file. I can open it without any AI tool, inspect it directly, and run queries from the command line.
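As an illustration of the pattern, the sketch below uses Python's built-in sqlite3 module to read live matter context at the start of a task and to log a material change to an audit trail. The schema and field names are my own invention for this example, not the actual database:

```python
import sqlite3
from datetime import datetime, timezone

con = sqlite3.connect(":memory:")  # in practice, a standard .sqlite file on disk
con.executescript("""
CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT, conflict_marker TEXT);
CREATE TABLE matters (id INTEGER PRIMARY KEY,
                      client_id INTEGER REFERENCES clients(id),
                      title TEXT, fee_basis TEXT, status TEXT);
CREATE TABLE audit_log (ts TEXT, entity TEXT, entity_id INTEGER, change TEXT);
""")

con.execute("INSERT INTO clients VALUES (1, 'Acme Ltd', 'none')")
con.execute("INSERT INTO matters VALUES "
            "(1, 1, 'Data protection review', 'fixed fee', 'active')")

def matter_context(matter_id: int) -> dict:
    """Read live client and matter state at the start of a workflow run."""
    row = con.execute("""
        SELECT c.name, c.conflict_marker, m.title, m.fee_basis, m.status
        FROM matters m JOIN clients c ON c.id = m.client_id
        WHERE m.id = ?""", (matter_id,)).fetchone()
    keys = ("client", "conflict_marker", "matter", "fee_basis", "status")
    return dict(zip(keys, row))

def log_change(entity: str, entity_id: int, change: str) -> None:
    """Material changes are appended to the audit trail."""
    con.execute("INSERT INTO audit_log VALUES (?, ?, ?, ?)",
                (datetime.now(timezone.utc).isoformat(), entity, entity_id, change))
    con.commit()

ctx = matter_context(1)
log_change("matter", 1, "status reviewed at intake")
```

The point of the pattern is that the workflow reads state from the database, not from whatever happened to be typed into a prompt, and the same file can be opened and queried from the command line with no AI layer present.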
Layer three: file output
The third layer is file-based output. Drafts, memos, redlines, letters, policies and board papers are produced in standard office formats and saved to a matter-organised file structure with version history that can be inspected, reverted, or handed to a reviewer without the AI layer being present. The output is the work product, not the chat transcript. Finished work passes through a precedent index so it can be retrieved and reused on similar matters.
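A minimal sketch of the matter-organised, versioned naming convention, assuming a folder layout and filename scheme of my own invention (the real structure will differ):

```python
from pathlib import Path
import tempfile

def next_version_path(root: Path, client: str, matter: str,
                      doc_type: str, stem: str, ext: str = ".docx") -> Path:
    """Return the next versioned path in a matter-organised folder,
    e.g. <root>/Acme-Ltd/dp-review/drafts/advice-memo-v2.docx."""
    folder = root / client / matter / doc_type
    folder.mkdir(parents=True, exist_ok=True)
    existing = sorted(folder.glob(f"{stem}-v*{ext}"))
    return folder / f"{stem}-v{len(existing) + 1}{ext}"

# Demo against a throwaway root; a real deployment would point at the matter store.
root = Path(tempfile.mkdtemp())
p1 = next_version_path(root, "Acme-Ltd", "dp-review", "drafts", "advice-memo")
p1.write_text("first draft")
p2 = next_version_path(root, "Acme-Ltd", "dp-review", "drafts", "advice-memo")
```

Because every version is an ordinary file on disk, the history can be inspected or handed to a reviewer with nothing more than a file manager.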
Layer four: model
The fourth layer is the model. The default today is Claude, but the architecture supports multiple providers through a model adapter: I can direct specific tasks to ChatGPT instead, and locally hosted models are a planned extension as they become capable of the work. A model change is a configuration change, not a redesign. Workflows are plain text specifications, the database is a standard open-format file, outputs are standard office files, and the connectors between them use open interfaces. That matters for vendor risk, for data protection (different models have different processor terms and transfer positions), and for the pace of change in the market.
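The adapter idea can be sketched as follows. The class and routing names are hypothetical and the providers are stubbed rather than wired to real APIs; the point is only that switching provider is a change to a configuration table, not to the workflows:

```python
from typing import Protocol

class ModelAdapter(Protocol):
    """Common interface every provider adapter must satisfy."""
    def complete(self, task: str, prompt: str) -> str: ...

class ClaudeAdapter:
    # A real deployment would call the provider's API here; stubbed for the sketch.
    def complete(self, task: str, prompt: str) -> str:
        return f"[claude:{task}] draft output"

class ChatGPTAdapter:
    def complete(self, task: str, prompt: str) -> str:
        return f"[chatgpt:{task}] draft output"

ADAPTERS = {"claude": ClaudeAdapter, "chatgpt": ChatGPTAdapter}

# Routing is configuration: redirecting a task to another provider
# (or a future locally hosted model) means editing this table only.
ROUTING = {"advice-memo": "claude",
           "document-review": "claude",
           "article-draft": "chatgpt"}

def run_task(task: str, prompt: str) -> str:
    adapter: ModelAdapter = ADAPTERS[ROUTING.get(task, "claude")]()
    return adapter.complete(task, prompt)
```

Because the workflows only ever see the common interface, a model change is contained in the adapter layer, which is what keeps vendor risk and processor-terms decisions separable from the method itself.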
Alongside the four layers, there is a knowledge base. Completed matters, published articles, and significant analysis are indexed in a structured wiki. A regulatory position established for one client can be retrieved, reviewed, and applied to a related matter. A blog post published today becomes a citable entry in the knowledge base. The wiki is written in plain text and versioned in Git, so it is readable without any AI tool and its growth is cumulative: each piece of work makes the next piece easier.
Governance runs end-to-end. Before use: inputs are defined, documents are loaded, and scope is set. During use: the model’s role is constrained by the workflow, context is controlled, and sources it may consult are listed. After use: outputs are reviewed against the workflow’s quality control checklist. The final gate is always the lawyer. No output reaches a client without passing through that gate, and that gate is a professional judgement, not a software check. No essential work product or institutional knowledge is held only inside the model.
Deploying it
I deployed my system in stages, and deliberately not in client work first.
The first deployments were non-legal, non-regulated: a rebuild of the Bratby Law website, blog post production, and LinkedIn content. That work exercised the architecture, including workflows, database, file outputs and quality control checks, with significantly lower professional exposure if something went wrong. It surfaced the weak points in the design, let me refine the workflows and the checklists, and built my confidence that the method was reproducible rather than accidental.
Only once the architecture had been shaken down on lower-stakes work did I move it into the regulated side of the practice. Regulatory advice, document review, data protection work, and client-facing drafting followed in sequence. The workflows have been in live use across those areas for several months. They get better with use: every matter adds a reviewed precedent, every edge case adds a check to a workflow, and every published article adds an entry to the knowledge base.
The practical effect is not that AI has replaced any part of the method. It has made the method faster and more consistent while leaving every professional responsibility exactly where it was.
Viewpoint
The value of AI in legal practice sits with the lawyer. If that is right, the system has to reflect it. You do not build a law firm around a model. You design a system in which models operate under your control. Tools will change. Models will improve. Providers will come and go. A well-designed operating model will survive those changes, and so will the professional accountability that sits behind it.
The profession is still working through what that means in practice. In my experience, the firms that will fare best are those that treat AI as a capability to be governed rather than a product to be purchased. Governance is not a brake on productivity; it is what makes productivity defensible. And the most important governance check in any AI-assisted legal workflow is the one that was always there: the lawyer reading the output before it goes out the door.
Key sources
SRA Code of Conduct for Solicitors | SRA guidance on generative AI | Law Society: Generative AI report | ICO: Guidance on AI and data protection | UK GDPR
For advice on AI and data governance, data protection for AI-enabled products, or the legal and regulatory aspects of AI deployment, contact Rob Bratby at Bratby Law.
