AI-Generated Code and Open-Source Licences (Ireland)
Steven | TrustYourWebsite · 15 May 2026 · Last updated: May 2026
A Series-A due-diligence solicitor flags two short JavaScript functions in your front-end bundle as substantially similar to a GPL-3.0 file on GitHub. Your developer used Cursor to write the form-validation logic and didn't think about it again. This article walks through whether that exposure is real, who carries it and what to do about it.
How Irish copyright law fits AI-generated code
Irish copyright sits under the Copyright and Related Rights Act 2000 (CRRA 2000, Number 28 of 2000), most recently amended by S.I. No. 567 of 2021, the European Union (Copyright and Related Rights in the Digital Single Market) Regulations 2021. The 2021 regulations transposed Directive (EU) 2019/790, the DSM Directive, into Irish law. That body of statute is the framework an Irish small business operator is held to when a maintainer or a due-diligence solicitor raises a question about AI-suggested code in a public bundle. The Software Directive 2009/24/EC and the Information Society Directive 2001/29/EC continue to apply in parallel, transposed by earlier instruments still in force.
Two provisions in the 2000 Act do most of the work in this area. Section 2(1) defines a literary work as including a computer program, which means a code file attracts copyright on creation without registration and without any deposit step. Section 21(f) addresses computer-generated works, defining the first author as the natural or legal person by whom the arrangements necessary for the creation of the work are undertaken. That section was drafted in 1999 with database engines and procedural generators in mind, and applying it to a 2026 large language model is a matter of statutory interpretation the High Court has not yet ruled on. What it does mean in practice is that the developer who prompts Copilot or Cursor is the most plausible candidate for "the person by whom the arrangements necessary" are undertaken, not the AI vendor and not the model. Where that developer is an employee acting in the course of employment, section 23 then assigns first ownership to the employer unless the agreement says otherwise. Code written by a freelancer or an agency under a commission falls outside that rule, so ownership there depends on what the contract assigns.
The DSM transposition matters separately for the upstream training question. The Article 4 text and data mining exception in Directive (EU) 2019/790 is brought into Irish law by the 2021 regulations and permits commercial TDM, including the training of a coding assistant, unless the rightsholder has reserved its rights expressly and in a machine-readable form. Public GitHub repositories do not usually carry a machine-readable opt-out, and the open-source licence terms attached to them are licence grants rather than reservations against TDM. That is the legal framework AI vendors rely on to train on the open-source corpus in the first place, and it sits underneath everything that flows downstream to the operator. An Irish operator's exposure is not on the training side. It is on the redistribution side, where the operator hands the output to visitors as part of a JavaScript bundle.
Enforcement of copyright disputes runs through the Commercial Court list of the High Court when the value or the complexity warrants it. Smaller claims go to the Circuit Court or the District Court. The Intellectual Property Office of Ireland (IPOI) in Kilkenny administers patents and trademarks but does not arbitrate copyright disputes, which are pure court matters. The Competition and Consumer Protection Commission (CCPC) is adjacent and not central: misleading commercial practices that arise from how an AI-built site is sold to consumers can sit within the Consumer Protection Act 2007, but the CCPC does not enforce copyright. For copyright specifically, the live parties are the rightsholder, the operator and the High Court. The Data Protection Commission (DPC) sits to one side as well, because copyright over AI-generated code is not in the DPC's remit, but adjacent breaches around the same site often are.
The Doe v. GitHub litigation in the Northern District of California is the most-cited live case in this area worldwide, but a Californian District Court ruling does not bind an Irish judge. The interpretive weight in Ireland comes from how the High Court would read the Software Directive 2009/24/EC, the Information Society Directive 2001/29/EC and the DSM Directive against the facts. The Court of Justice of the European Union has handled adjacent originality questions in Infopaq International A/S v Danske Dagblades Forening (C-5/08) and Painer v Standard Verlags GmbH (C-145/10), and an Irish court would normally begin with those before considering US persuasive material. The practical implication for an Irish operator is that public-facing risk is governed by Irish-statute remedies, with the open-source licence treated as a contractual instrument under Irish contract law, even when the headline case originates in the United States.
GitHub itself has its EMEA headquarters in Dublin, and a substantial Irish small-business customer base uses Copilot under terms governed by Irish law for European subscribers. The IP indemnification clause in Copilot Business and Enterprise contracts is governed by the same European terms. That is a useful jurisdictional fact: when an Irish operator relies on the indemnification, the enforcing forum and the substantive law applied to the indemnification itself can be Irish rather than Californian. The clause does not change who is sued first by an upset maintainer, but it does change who pays at the end if the indemnification is triggered.
What the AI actually did
Coding assistants like GitHub Copilot, Cursor, Claude and Cody were trained on huge volumes of public source code, including repositories under GPL, MIT, Apache and BSD licences. The training process did not preserve attribution metadata, and the models learned patterns rather than entire files. When a developer prompts the assistant, the model produces an output that is sometimes a novel construction and sometimes a near-identical reproduction of a specific training-data file. The assistant does not warn the developer which is which, and it does not emit an SPDX header or a copyright notice.
That is the technical fact at the bottom of the legal question. The model is not licensed to redistribute training-data code, and the developer is not warned when the output is structurally close to a specific source.
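To make "structurally close" concrete, here is a minimal sketch in Python of why a verbatim comparison misses near-identical output. It is an illustration only, not how any vendor's filter or any licence scanner works. The two JavaScript snippets, the function names and the tiny keyword list are all hypothetical.

```python
import difflib
import re

# Hypothetical, deliberately tiny keyword list; a real tool would use a proper parser.
KEYWORDS = {"function", "return", "if", "else", "const", "let", "var", "for", "while"}

def normalise(source: str) -> list[str]:
    """Split source into tokens and replace every identifier with "ID",
    so that renaming variables does not hide structural similarity."""
    tokens = re.findall(r"[A-Za-z_$][\w$]*|\d+|\S", source)
    return [
        tok if tok in KEYWORDS or not re.match(r"[A-Za-z_$]", tok) else "ID"
        for tok in tokens
    ]

def similarity(a: str, b: str) -> float:
    """Ratio between 0.0 and 1.0 over the normalised token sequences."""
    matcher = difflib.SequenceMatcher(None, normalise(a), normalise(b), autojunk=False)
    return matcher.ratio()

# Hypothetical upstream file and a hypothetical assistant suggestion.
upstream = r"function isEmail(value) { return /^\S+@\S+$/.test(value); }"
suggested = r"function checkMail(input) { return /^\S+@\S+$/.test(input); }"
unrelated = "const total = items.reduce((sum, item) => sum + item.price, 0);"

print(upstream == suggested)             # False: not a verbatim copy
print(similarity(upstream, suggested))   # 1.0 once identifiers are normalised
print(similarity(upstream, unrelated))   # lower: different structure
```

The point of the sketch is the gap between the first two results: a check that only looks for exact matches reports nothing, while the structure of the two functions is the same. Whether that degree of similarity amounts to infringement is a legal question the code cannot answer.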
Who is exposed
The site operator distributes the code that ships to visitors. A browser loading your homepage receives the JavaScript bundle. Under GPL and similar copyleft licences, that is distribution to the end user. The operator is the entity making it available, regardless of whether the operator wrote the line of code or the agency did or an AI suggested it.
This is the same liability chain that applies to web-designer-introduced copyright issues. The pre-AI version of the problem is a designer who dropped an unlicensed Getty photo into the carousel. The post-AI version is a developer who accepted a Copilot suggestion that reproduced a GPL source file. The structure is the same. The public-facing party is the operator. The internal cost allocation between operator and agency is contract.
Sitting next to this is the broader question of who pays when AI-built sites break compliance. GDPR enforcement by the Data Protection Commission and EAA enforcement under the Irish transposing regulations flow to the operator on the same principle. Copyright on AI-generated code is the copyright corner of that same map.
What the courts have actually said
The leading case is Doe v. GitHub, Inc., filed November 2022 in the Northern District of California. Anonymous developer plaintiffs sued GitHub, Microsoft and OpenAI over Copilot's training on public open-source code. The procedural posture moves, and the table below is a snapshot as of May 2026. Re-verify before relying on it.
Doe v. GitHub claim-by-claim status, May 2026 (last verified 15 May 2026).
| Claim | Status as of May 2026 | What it means for your site |
|---|---|---|
| DMCA § 1202(b) on removing copyright management information | Dismissed with prejudice, January 2024, for "near-identical" outputs | Plaintiffs would need verbatim reproduction to revive. Risk for SMBs: low on this specific theory. |
| Breach of open-source licence terms (MIT, GPL, Apache and others) | Allowed to proceed | Open-source licences are treated as enforceable contracts. Risk for SMBs: moderate where client-side code distributes the output. |
| Tortious interference and unfair competition | Mixed dispositions, some claims survived | Not directly SMB-relevant. The dispute is between the plaintiffs and the AI provider. |
| Unjust enrichment | Dismissed | Not SMB-relevant. |
The headline takeaway is narrow. The court has not yet ruled on the central substantive question of whether AI-generated output substantially similar to training code violates the original licence. What it has done is sorted the claim theories. The technical "removal of copyright management information" route under DMCA § 1202(b) is closed where the output is "near-identical with semantically insignificant variations." The contract route, treating an open-source licence as a binding agreement that the AI provider's use violated, is still live. Procedural updates appear on the BakerHostetler tracker and on the plaintiffs' counsel's case page. The plaintiffs' page is one side's framing and should be treated as such.
GPL distribution and your website
The legal question turns on a technical one. What counts as distributing the code?
GPL-style copyleft licences attach attribution and source-availability duties to anyone who distributes a covered work. Distribution to an end user is the trigger. For a website, this maps to two cases.
Client-side JavaScript that ships to the visitor's browser is distribution. Every page load delivers the bundle to a third party, which is the GPL distribution case. If the bundle contains code that is substantially similar to a GPL-licensed file, attribution and source-availability duties apply.
Server-side code that never leaves your server is generally not GPL distribution. The exception is AGPL, where Section 13 treats network use as distribution. Most SMB sites do not run AGPL-licensed backend code, so the practical exposure is concentrated in the client-side bundle: form validation, animations, modals, helper utilities, the kind of small functions a developer asks an AI to write.
This is why the AI-code question matters more for the front end than for the back end of your site. A WordPress plugin that uses Copilot-suggested PHP on the server runs at lower exposure for non-AGPL code than a React component the assistant wrote that ships to every visitor.
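The client-side versus server-side distinction can be written down as a small decision rule. The sketch below is a simplification of the reasoning in this section, not a legal test: the `Component` type, its field names and the example components are hypothetical, and the licence sets are short illustrative lists of SPDX identifiers.

```python
from dataclasses import dataclass

# Illustrative SPDX identifiers only; not an exhaustive list of copyleft licences.
COPYLEFT = {"GPL-2.0-only", "GPL-2.0-or-later", "GPL-3.0-only", "GPL-3.0-or-later"}
NETWORK_COPYLEFT = {"AGPL-3.0-only", "AGPL-3.0-or-later"}

@dataclass(frozen=True)
class Component:
    name: str
    spdx_licence: str
    ships_to_browser: bool             # True if it ends up in the client-side bundle
    serves_network_users: bool = True  # True if visitors interact with it remotely

def copyleft_duties_triggered(c: Component) -> bool:
    """Apply the rule of thumb from this section.

    GPL: duties attach when the code is delivered to the visitor's browser.
    AGPL: network use is enough, even if the code never leaves the server.
    Permissive licences (MIT, BSD, Apache) still carry notice requirements,
    which this sketch does not model.
    """
    if c.spdx_licence in NETWORK_COPYLEFT:
        return c.ships_to_browser or c.serves_network_users
    if c.spdx_licence in COPYLEFT:
        return c.ships_to_browser
    return False

# Hypothetical components on a small business site.
react_helper = Component("form-validation.js", "GPL-3.0-only", ships_to_browser=True)
php_plugin = Component("invoice-plugin.php", "GPL-3.0-only", ships_to_browser=False)
agpl_backend = Component("search-service", "AGPL-3.0-only", ships_to_browser=False)

print(copyleft_duties_triggered(react_helper))  # True
print(copyleft_duties_triggered(php_plugin))    # False
print(copyleft_duties_triggered(agpl_backend))  # True
```

The rule treats the front-end helper and the server-side plugin differently even though both carry the same licence, which is the asymmetry this section describes.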
How realistic is the risk
Here is an honest probability hierarchy, ordered from most to least likely.
The first realistic scenario is an investor or acquirer running due diligence on your codebase before a funding round or an exit. Their lawyers run a licence scanner like FOSSA, ScanCode or licensee. If the scanner flags GPL-licensed code in a proprietary product, the deal-team asks questions. The outcome is usually a remediation budget and a delay, not a killed deal. This is the most common way SMBs find out they have a problem.
The second is an open-source maintainer noticing their code in your public bundle. Larger projects have community members who watch for unattributed reuse. The first contact is a polite email asking for attribution. Escalation looks like a DMCA takedown sent to your host, which interrupts service until you respond. Lawsuits at this level are rare for SMBs because the cost of bringing one outweighs the recovery against a small business.
The third is enforcement by a copyleft-licence steward organisation such as the Software Freedom Conservancy. These groups do bring enforcement actions, but their pattern is to engage in long correspondence first and to target hardware vendors or larger software companies. The threshold for an SMB website is high.
In practice, the realistic week-to-week risk for a small business site is close to zero. The risk concentrates around three moments: a funding round, an acquisition or a maintainer searching the internet for their distinctive function. None of these is likely in a given month, but all are predictable and avoidable.
Practical mitigation if you or your developer use AI tools
Five things to do. None of these is a legal defence and none should be sold to you as one. They are engineering hygiene that reduces the chance the problem ever surfaces.
First, turn on the duplication filter in Copilot, Cursor or any other coding assistant that offers one. The filter blocks suggestions that match training-data code above a similarity threshold. It does not eliminate near-identical output, but it does reduce the worst case. Confirm the setting is on in the developer's actual editor configuration, not just on the team account.
Second, run a licence scanner before deployment. Free tools include licensee, scancode-toolkit and ort (the OSS Review Toolkit). Commercial options include FOSSA, Snyk Licence and Black Duck. The scanner reads your package manifests and your source tree and flags licences that conflict with your distribution model. Even a single run against the production bundle is far more useful than none.
Third, if your developer is on paid Copilot Business or Enterprise, GitHub offers an IP indemnification commitment against third-party claims arising from Copilot output, conditional on the duplication filter being enabled. This is a meaningful contractual backstop, but it is conditional on the filter setting and limited to the named plans, and it should be checked against the current terms before you rely on it. Free Copilot, Cursor, Claude and Cody do not, as of May 2026, offer equivalent commitments.
Fourth, update your agency contract. Add a clause that the agency will not use AI-assisted code that incorporates GPL or AGPL output without explicit written notice to you, and that the agency warrants the delivered site does not infringe third-party licences. This does not protect you from the maintainer who notices. It does give you a route to push the cost back to the agency if a claim arises.
Fifth, keep a software bill of materials for your client-side bundle. Tools like cyclonedx-bom or the SBOM exports built into modern bundlers list every dependency and its licence. If a question arises in a year, having an SBOM from the release in question saves a week of work.
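As an illustration of what an SBOM makes possible, the sketch below reads a CycloneDX-style JSON document and lists components whose declared licence starts with `GPL-` or `AGPL-`. It assumes the common CycloneDX JSON layout (a top-level `components` array, each entry with a `licenses` list). The sample SBOM and its component names are hypothetical, and compound licence expressions such as `(MIT OR GPL-3.0-only)` are not parsed. It is not a substitute for the scanners named above.

```python
import json

# Prefixes this sketch treats as copyleft; an assumption, adjust to your policy.
COPYLEFT_PREFIXES = ("GPL-", "AGPL-")

def licence_ids(component: dict) -> list[str]:
    """Collect the licence id, name or expression declared for one component."""
    found = []
    for entry in component.get("licenses", []):
        licence = entry.get("license", {})
        value = licence.get("id") or licence.get("name") or entry.get("expression")
        if value:
            found.append(value)
    return found

def flag_copyleft(sbom: dict) -> list[tuple[str, str]]:
    """Return (component name, licence) pairs that need a closer look."""
    flagged = []
    for component in sbom.get("components", []):
        for lid in licence_ids(component):
            if lid.startswith(COPYLEFT_PREFIXES):
                flagged.append((component.get("name", "?"), lid))
    return flagged

# Hypothetical SBOM for a small client-side bundle.
sample = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"name": "tiny-modal", "version": "2.1.0",
     "licenses": [{"license": {"id": "MIT"}}]},
    {"name": "form-helpers", "version": "0.4.2",
     "licenses": [{"license": {"id": "GPL-3.0-only"}}]},
    {"name": "date-utils", "version": "1.0.0"}
  ]
}
""")

print(flag_copyleft(sample))  # [('form-helpers', 'GPL-3.0-only')]
```

Note what the sketch cannot see: an SBOM lists declared dependencies, so a GPL-derived function that an assistant pasted directly into your own source file will not appear in it. That gap is why the SBOM complements the scanner and the duplication filter and does not replace them.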
Our free compliance scan covers GDPR, cookies, accessibility and image rights on the live site. It does not check open-source licence compliance, which is a separate developer-tooling job. Treat the two as parallel tracks on the same site.
What changes on 9 December 2026
Directive (EU) 2024/2853, the new Product Liability Directive, treats software including AI systems as products from 9 December 2026. Ireland must transpose by that date. Article 4 brings AI tools into scope. Article 2(2) excludes open-source software developed outside a commercial activity, so the public open-source maintainer is not the defendant in a PLD claim. The commercial AI vendor is.
The relevance to AI-generated code is narrow. An Irish small business harmed by a defective AI tool, for example where the AI emits code with a security flaw that leads to a data breach with downstream harm to a natural person, may have a new no-fault claim path against the AI vendor under the directive. The claim is for damage to natural persons, and it applies only to products placed on the market after 9 December 2026. The PLD is not a route for the operator to recover a Data Protection Commission fine and it does not retroactively reach pre-existing tools. The Product Liability Directive in depth covers the scope and exclusions.
What this is not
This article is about open-source licence exposure when an AI writes code on your site. Three adjacent topics share words with this one.
Chatbot disclosure and AI-generated marketing copy labelling sit under Article 50 of the AI Act and AI-generated content, which is a separate regime from source-code copyright. The image side, where AI-generated illustrations or photographs may infringe, lives in AI-generated images on your site. The cookie-banner and accessibility version of the same liability chain is the broader GDPR and accessibility question. For the pre-AI image-letter version, see the Getty Images letter guide.
Common Questions
Does Copilot's duplication filter eliminate the risk?
No. The filter reduces the chance of verbatim reproduction of training-data code, which is the worst case. It does not address near-identical output that still resembles a specific open-source file. Treat the filter as risk reduction, not as a legal shield.
Am I liable if my freelancer used Cursor without telling me?
The site operator is the party distributing the code to visitors. An open-source maintainer who notices their code in your bundle writes to the domain owner. Your freelancer may owe you a fix under contract, but the public-facing exposure sits with you.
Does this apply to server-side code or just client-side?
Mostly client-side. Code that ships to the browser is distribution under GPL and triggers attribution and source-availability duties. Server-side code that never leaves your server is generally not GPL distribution, except for AGPL, where network use counts as distribution under Section 13.
Is there any AI coding tool that is safer than others?
Paid Copilot Business and Enterprise plans include an IP indemnity from GitHub when the duplication filter is enabled. No equivalent commitment is standard on Cursor, Claude, Cody or free Copilot tiers as of May 2026. Verify current terms before relying on any vendor promise.
Related reading
Cluster pieces that pair with this one:
- The full liability picture for AI-built sites in Ireland. The hub article on GDPR, EAA and cookie-law liability for AI-assisted sites.
- AI-generated images on your website. The image side of the AI-output copyright question.
- Product Liability Directive 2024/2853. Strict-liability claims for damage from defective AI tools, applicable from 9 December 2026.
- Web designer copyright liability. The pre-AI parent article on agency-client copyright chains.
This article is technical analysis, not legal advice. The author is not your solicitor. For a binding view on a live licence question, talk to one.