AI-Generated Code and Open-Source Licences (Ireland)

Steven | TrustYourWebsite · 15 May 2026 · Last updated: May 2026

A Series-A due-diligence solicitor flags two short JavaScript functions in your front-end bundle as substantially similar to a GPL-3.0 file on GitHub. Your developer used Cursor to write the form-validation logic and didn't think about it again. This article walks through whether that exposure is real, who carries it and what to do about it.

Irish copyright sits under the Copyright and Related Rights Act 2000 (CRRA 2000, Number 28 of 2000), most recently amended by S.I. No. 567 of 2021, the European Union (Copyright and Related Rights in the Digital Single Market) Regulations 2021. The 2021 regulations transposed Directive (EU) 2019/790, the DSM Directive, into Irish law. That body of statute is the framework an Irish small business operator is held to when a maintainer or a due-diligence solicitor raises a question about AI-suggested code in a public bundle. The Software Directive 2009/24/EC and the Information Society Directive 2001/29/EC continue to apply in parallel, transposed by earlier instruments still in force.

Two provisions in the 2000 Act do most of the work in this area. Section 2(1) defines a literary work as including a computer program, which means a code file attracts copyright on creation without registration and without any deposit step. Section 21(f) addresses computer-generated works, defining the first author as the natural or legal person by whom the arrangements necessary for the creation of the work are undertaken. That section was drafted in 1999 with database engines and procedural generators in mind, and applying it to a 2026 large language model is a matter of statutory interpretation the High Court has not yet ruled on. What it does mean in practice is that the developer who prompts Copilot or Cursor is the most plausible candidate for "the person by whom the arrangements necessary for the creation of the work are undertaken", not the AI vendor and not the model. Where that developer is acting under a commission, section 23 then assigns first ownership to the commissioner unless the agreement says otherwise.

The DSM transposition matters separately for the upstream training question. The Article 4 text and data mining exception in Directive (EU) 2019/790 is brought into Irish law by the 2021 regulations and permits commercial TDM, including the training of a coding assistant, unless the rightsholder has reserved its rights expressly and in a machine-readable form. Public GitHub repositories do not usually carry a machine-readable opt-out, and the open-source licence terms attached to them are licence grants rather than reservations against TDM. That is the legal framework AI vendors rely on to train on the open-source corpus in the first place, and it sits underneath everything that flows downstream to the operator. An Irish operator's exposure is not on the training side. It is on the redistribution side, where the operator hands the output to visitors as part of a JavaScript bundle.

Enforcement of copyright disputes runs through the Commercial Court list of the High Court when the value or the complexity warrants it. Smaller claims go to the Circuit Court or the District Court. The Intellectual Property Office of Ireland (IPOI) in Kilkenny administers patents and trademarks but does not arbitrate copyright disputes, which are pure court matters. The Competition and Consumer Protection Commission (CCPC) is adjacent and not central: misleading commercial practices that arise from how an AI-built site is sold to consumers can sit within the Consumer Protection Act 2007, but the CCPC does not enforce copyright. For copyright specifically, the live parties are the rightsholder, the operator and the High Court. The Data Protection Commission (DPC) sits to one side as well, because copyright over AI-generated code is not in the DPC's remit, but adjacent breaches around the same site often are.

The Doe v. GitHub litigation in the Northern District of California is the most-cited live case in this area worldwide, but a Californian District Court ruling does not bind an Irish judge. The interpretive weight in Ireland comes from how the High Court would read the Software Directive 2009/24/EC, the Information Society Directive 2001/29/EC and the DSM Directive against the facts. The Court of Justice of the European Union has handled adjacent originality questions in Infopaq International A/S v Danske Dagblades Forening (C-5/08) and Painer v Standard Verlags GmbH (C-145/10), and an Irish court would normally begin with those before considering US persuasive material. The practical implication for an Irish operator is that public-facing risk is governed by Irish-statute remedies, with the open-source licence treated as a contractual instrument under Irish contract law, even when the headline case originates in the United States.

GitHub itself has its EMEA headquarters in Dublin, and a substantial Irish small-business customer base uses Copilot under terms governed by Irish law for European subscribers. The IP indemnification clause in Copilot Business and Enterprise contracts is governed by the same European terms. That is a useful jurisdictional fact: when an Irish operator relies on the indemnification, the enforcing forum and the substantive law applied to the indemnification itself can be Irish rather than Californian. The clause does not change who is sued first by an upset maintainer, but it does change who pays at the end if the indemnification is triggered.

Figure: From prompt to public distribution. How AI-suggested code becomes a licence exposure on your website, in five stages: (1) the developer prompts Cursor, Copilot or Claude; (2) the AI suggestion may reproduce training-data patterns; (3) the code lands in the agency repository with no licence metadata preserved; (4) the code is bundled into your site and served to browser visitors; (5) public distribution, where GPL, MIT or Apache obligations are triggered. Exposure by code location: server-side code carries lower exposure (AGPL excepted); client-side JavaScript is full distribution to end users. Annotation, Doe v. GitHub, January 2024: DMCA section 1202(b) claims for "near-identical, not verbatim" output dismissed with prejudice; open-source licence-breach claims allowed to proceed; case ongoing. Where the licence weight lives: the maintainer who notices their code reaches out to the entity distributing it. That is the site operator, not the developer and not the AI vendor. The developer's contract with the AI vendor stays in the background. The operator handles the public-facing question.
The legal weight sits at the last stage. The further down the chain you sit, the more you carry.

What the AI actually did

Coding assistants like GitHub Copilot, Cursor, Claude and Cody were trained on huge volumes of public source code, including repositories under GPL, MIT, Apache and BSD licences. The training process did not preserve attribution metadata, and the models learned patterns rather than entire files. When a developer prompts the assistant, the model produces an output that is sometimes a novel construction and sometimes a near-identical reproduction of a specific training-data file. The assistant does not warn the developer which is which, and it does not emit a SPDX header or a copyright notice.

That is the technical fact at the bottom of the legal question. The model is not licensed to redistribute training-data code, and the developer is not warned when the output is structurally close to a specific source.
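To make "near-identical" concrete, here is a minimal Python sketch of how one might compare an AI-suggested function against a known open-source file. It is an illustration only: it is not how Copilot's filter, any licence scanner or any court measures similarity, and the function names and the naive comment-stripping rules are assumptions made for this example. A high ratio is a prompt to look closer, not a legal finding.

```python
import difflib
import re


def normalise(source: str) -> str:
    """Strip JavaScript-style comments and collapse whitespace.

    Deliberately naive: the line-comment rule would also strip '//' inside
    string literals such as URLs. Good enough for a first look, nothing more.
    """
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)  # block comments
    source = re.sub(r"//[^\n]*", " ", source)                    # line comments
    return " ".join(source.split())


def similarity(candidate: str, reference: str) -> float:
    """Return a 0.0 to 1.0 ratio of how closely two snippets match once
    comments and whitespace differences are removed."""
    matcher = difflib.SequenceMatcher(
        None, normalise(candidate), normalise(reference), autojunk=False
    )
    return matcher.ratio()
```

Called as `similarity(suggested_code, upstream_file_text)`, two snippets that differ only in comments and indentation score 1.0. Renamed variables lower the score without changing the structure, which is why a ratio alone cannot settle whether output is "substantially similar".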

Who is exposed

The site operator distributes the code that ships to visitors. A browser loading your homepage receives the JavaScript bundle. Under GPL and similar copyleft licences, that is distribution to the end user. The operator is the entity making it available, regardless of whether the operator wrote the line of code or the agency did or an AI suggested it.

This is the same liability chain that applies to web-designer-introduced copyright issues. The pre-AI version of the problem is a designer who dropped an unlicensed Getty photo into the carousel. The post-AI version is a developer who accepted a Copilot suggestion that reproduced a GPL source file. The structure is the same. The public-facing party is the operator. How the cost is then split between operator and agency is a matter of contract.

Sitting next to this is the broader question of who pays when AI-built sites break compliance. GDPR enforcement by the Data Protection Commission and EAA enforcement under the Irish transposing regulations flow to the operator on the same principle. Copyright on AI-generated code is the copyright corner of that same map.

What the courts have actually said

The leading case is Doe v. GitHub, Inc., filed November 2022 in the Northern District of California. Anonymous developer plaintiffs sued GitHub, Microsoft and OpenAI over Copilot's training on public open-source code. The procedural posture moves, and the table below is a snapshot as of May 2026. Re-verify before relying on it.

<!-- LAST VERIFIED: 2026-05-15 -->

Doe v. GitHub claim-by-claim status, May 2026.

| Claim | Status as of May 2026 | What it means for your site |
| --- | --- | --- |
| DMCA § 1202(b) on removing copyright management information | Dismissed with prejudice, January 2024, for "near-identical" outputs | Plaintiffs would need verbatim reproduction to revive. Risk for SMBs: low on this specific theory. |
| Breach of open-source licence terms (MIT, GPL, Apache and others) | Allowed to proceed | Open-source licences are treated as enforceable contracts. Risk for SMBs: moderate where client-side code distributes the output. |
| Tortious interference and unfair competition | Mixed dispositions, some claims survived | Not directly SMB-relevant. The dispute is between the plaintiffs and the AI provider. |
| Unjust enrichment | Dismissed | Not SMB-relevant. |

A live procedural posture. Re-verify before relying on it.

Figure: Doe v. GitHub, claim-by-claim status, May 2026. The same four claim theories as the table above, annotated for an Irish small or medium business site: the DMCA is a US-specific statute and the ruling is persuasive only in Ireland; Irish contract law would similarly treat open-source licences as enforceable. Snapshot, May 2026. Re-verify with the docket: N.D. Cal. 4:22-cv-06823-JST. Trackers: bakerlaw.com/the-copilot-litigation · githubcopilotlitigation.com
Four claim theories from a US case. One dismissed with prejudice; one still live as a contract theory; the other two not directly SMB-relevant.

The headline takeaway is narrow. The court has not yet ruled on the central substantive question of whether AI-generated output substantially similar to training code violates the original licence. What it has done is sorted the claim theories. The technical "removal of copyright management information" route under DMCA § 1202(b) is closed where the output is "near-identical with semantically insignificant variations." The contract route, treating an open-source licence as a binding agreement that the AI provider's use violated, is still live. Procedural updates appear on the BakerHostetler tracker and on the plaintiffs' counsel's case page. The plaintiffs' page is one side's framing and should be treated as such.

GPL distribution and your website

The legal question turns on a technical one. What counts as distributing the code?

GPL-style copyleft licences attach attribution and source-availability duties to anyone who distributes a covered work. Distribution to an end user is the trigger. For a website, this maps to two cases.

Client-side JavaScript that ships to the visitor's browser is distribution. Every page load delivers the bundle to a third party, which is the GPL distribution case. If the bundle contains code that is substantially similar to a GPL-licensed file, attribution and source-availability duties apply.

Server-side code that never leaves your server is generally not GPL distribution. The exception is AGPL, where Section 13 treats network use as distribution. Most SMB sites do not run AGPL-licensed backend code, so the practical exposure is concentrated in the client-side bundle: form validation, animations, modals, helper utilities, the kind of small functions a developer asks an AI to write.

This is why the AI-code question matters more for the front end than for the back end of your site. A WordPress plugin that uses Copilot-suggested PHP on the server runs at lower exposure for non-AGPL code than a React component the assistant wrote that ships to every visitor.
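The rule of thumb in this section can be written down as a small decision function. This is a sketch of the article's simplification, not a statement of what any licence requires in a particular case. The `Component` type, the function name and the choice of SPDX identifiers are assumptions made for illustration, and the attribution duties that permissive licences such as MIT and Apache attach to distribution are deliberately not modelled.

```python
from dataclasses import dataclass

# SPDX identifiers used here as illustrative examples, not an exhaustive list.
GPL_FAMILY = {"GPL-2.0-only", "GPL-2.0-or-later", "GPL-3.0-only", "GPL-3.0-or-later"}
AGPL_FAMILY = {"AGPL-3.0-only", "AGPL-3.0-or-later"}


@dataclass(frozen=True)
class Component:
    name: str
    spdx_licence: str
    ships_to_browser: bool  # True for anything in the client-side bundle


def source_availability_duty_likely(component: Component) -> bool:
    """Apply the section's rule of thumb to one component."""
    if component.spdx_licence in AGPL_FAMILY:
        # AGPL section 13: network use is treated like distribution,
        # so server-side code is in scope too.
        return True
    if component.spdx_licence in GPL_FAMILY:
        # GPL: the trigger is distribution, i.e. code sent to the visitor.
        return component.ships_to_browser
    return False
```

Under these assumptions a GPL-3.0 validation helper in the React bundle returns `True`, while the same code running only inside a server-side plugin returns `False`.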

How realistic is the risk

Three scenarios, in order from most to least likely.

The first realistic scenario is an investor or acquirer running due diligence on your codebase before a funding round or an exit. Their lawyers run a licence scanner like FOSSA, ScanCode or licensee. If the scanner flags GPL-licensed code in a proprietary product, the deal-team asks questions. The outcome is usually a remediation budget and a delay, not a killed deal. This is the most common way SMBs find out they have a problem.

The second is an open-source maintainer noticing their code in your public bundle. Larger projects have community members who watch for unattributed reuse. The first contact is a polite email asking for attribution. Escalation looks like a DMCA takedown sent to your host, which interrupts service until you respond. Lawsuits at this level are rare for SMBs because the cost of bringing one outweighs the recovery against a small business.

The third is enforcement by a copyleft-licence steward organisation such as the Software Freedom Conservancy. These groups do bring enforcement actions, but their pattern is to engage in long correspondence first and to target hardware vendors or larger software companies. The threshold for an SMB website is high.

In practice, the week-to-week risk for a small business site is close to zero. The risk concentrates around three moments: a funding round, an acquisition, or a maintainer searching the internet for their distinctive function. None of these is likely in a given month, but all are predictable and can be prepared for.
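The most basic check a licence scan performs in the due-diligence scenario can be sketched in a few lines of Python: walk the source tree and report files that declare a copyleft licence in an SPDX tag. This is a toy under stated assumptions, not a substitute for FOSSA, ScanCode or licensee. The function name and the file-suffix list are invented for this example, compound SPDX expressions are not parsed, and because AI-suggested code carries no SPDX header at all, a tag search like this finds only code that was copied with its header intact.

```python
import re
from pathlib import Path

# Matches tags such as "SPDX-License-Identifier: GPL-3.0-only".
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+\-]+)")
COPYLEFT_PREFIXES = ("GPL-", "AGPL-", "LGPL-")


def find_copyleft_tags(root: Path, suffixes=(".js", ".mjs", ".ts", ".css")):
    """Return (relative path, licence) pairs for files under `root`
    whose SPDX tag names a GPL-family licence."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for licence in SPDX_TAG.findall(text):
            if licence.startswith(COPYLEFT_PREFIXES):
                hits.append((str(path.relative_to(root)), licence))
    return hits
```

Run against a build output directory, for example `find_copyleft_tags(Path("dist"))` where `dist` is a hypothetical folder name, an empty list means only that no tagged copyleft file was found. It says nothing about untagged code that resembles one.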

Practical mitigation if you or your developer use AI tools

Five things to do. None of these is a legal defence and none should be sold to you as one. They are engineering hygiene that reduces the chance the problem ever surfaces.

First, turn on the duplication filter in Copilot, Cursor or any other coding assistant that offers one. The filter blocks suggestions that match training-data code above a similarity threshold. It does not eliminate near-identical output, but it does reduce the worst case. Confirm the setting is on in the developer's actual editor configuration, not just on the team account.

Second, run a licence scanner before deployment. Free tools include licensee, scancode-toolkit and ort (the OSS Review Toolkit). Commercial options include FOSSA, Snyk Licence and Black Duck. The scanner reads your package manifests and your source tree and flags licences that conflict with your distribution model. Even a single run against the production bundle is better than none.

Third, if your developer is on paid Copilot Business or Enterprise, GitHub offers an IP indemnification commitment against third-party claims arising from Copilot output, conditional on the duplication filter being enabled. This is a meaningful contractual backstop, but it depends on the filter setting and is limited to the named plans. Check the current terms before relying on it. Free Copilot, Cursor, Claude and Cody do not, as of May 2026, offer equivalent commitments.

Fourth, update your agency contract. Add a clause that the agency will not use AI-assisted code that incorporates GPL or AGPL output without explicit written notice to you, and that the agency warrants the delivered site does not infringe third-party licences. This does not protect you from the maintainer who notices. It does give you a route to push the cost back to the agency if a claim arises.

Fifth, keep a software bill of materials for your client-side bundle. Tools like cyclonedx-bom or the SBOM exports built into modern bundlers list every dependency and its licence. If a question arises in a year, having an SBOM from the release in question saves a week of work.
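As a rough picture of what a bill of materials records, the Python sketch below lists the name, version and declared licence of each package installed in a `node_modules` folder. It is a simplified inventory, not a CycloneDX document, and the function name and output shape are assumptions made for this example. It reads only top-level and scoped packages, trusts whatever each `package.json` declares, and does not know which packages actually end up in the client-side bundle.

```python
import json
from pathlib import Path


def list_package_licences(node_modules: Path):
    """Return one dict per installed package: name, version, declared licence."""
    manifests = sorted(node_modules.glob("*/package.json"))       # e.g. some-lib/
    manifests += sorted(node_modules.glob("@*/*/package.json"))   # e.g. @scope/pkg/
    rows = []
    for manifest in manifests:
        data = json.loads(manifest.read_text(encoding="utf-8"))
        licence = data.get("license", "UNKNOWN")
        if isinstance(licence, dict):
            # Older manifests sometimes use {"type": ..., "url": ...}.
            licence = licence.get("type", "UNKNOWN")
        rows.append({
            "name": data.get("name", manifest.parent.name),
            "version": data.get("version", "UNKNOWN"),
            "licence": licence,
        })
    return rows
```

Saving the result as JSON next to each release gives a dated record to answer from. A dedicated SBOM tool remains the better choice where one is available, since it also captures transitive and nested dependencies.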

Our free compliance scan covers GDPR, cookies, accessibility and image rights on the live site. It does not check open-source licence compliance, which is a separate developer-tooling job. Treat the two as parallel tracks on the same site.

What changes on 9 December 2026

Directive (EU) 2024/2853, the new Product Liability Directive, treats software including AI systems as products from 9 December 2026. Ireland must transpose by that date. Article 4 brings AI tools into scope. Article 2(2) excludes open-source software developed outside a commercial activity, so the public open-source maintainer is not the defendant in a PLD claim. The commercial AI vendor is.

The relevance to AI-generated code is narrow. An Irish small business harmed by a defective AI tool, for example where the AI emits code with a security flaw that leads to a data breach with downstream harm to a natural person, may have a new no-fault claim path against the AI vendor under the directive. The claim is for damage to natural persons, and it applies only to products placed on the market after 9 December 2026. The PLD is not a route for the operator to recover a Data Protection Commission fine and it does not retroactively reach pre-existing tools. The Product Liability Directive in depth covers the scope and exclusions.

What this is not

This article is about open-source licence exposure when an AI writes code on your site. Three adjacent topics share words with this one.

Chatbot disclosure and the labelling of AI-generated marketing copy fall under Article 50 of the AI Act, which is a separate regime from source-code copyright. The image side, where AI-generated illustrations or photographs may infringe, is covered in AI-generated images on your site. The cookie-banner and accessibility version of the same liability chain is the broader GDPR and accessibility question. For the pre-AI image-letter version, see the Getty Images letter guide.

Common Questions

Does Copilot's duplication filter eliminate the risk?

No. The filter reduces the chance of verbatim reproduction of training-data code, which is the worst case. It does not address near-identical output that still resembles a specific open-source file. Treat the filter as risk reduction, not as a legal shield.

Am I liable if my freelancer used Cursor without telling me?

The site operator is the party distributing the code to visitors. An open-source maintainer who notices their code in your bundle writes to the domain owner. Your freelancer may owe you a fix under contract, but the public-facing exposure sits with you.

Does this apply to server-side code or just client-side?

Mostly client-side. Code that ships to the browser is distribution under GPL and triggers attribution and source-availability duties. Server-side code that never leaves your server is generally not GPL-distribution, except for AGPL, where network use counts as distribution under section 13.

Is there any AI coding tool that is safer than others?

Paid Copilot Business and Enterprise plans include an IP indemnity from GitHub when the duplication filter is enabled. No equivalent commitment is standard on Cursor, Claude, Cody or free Copilot tiers as of May 2026. Verify current terms before relying on any vendor promise.

This article is technical analysis, not legal advice. The author is not your solicitor. For a binding view on a live licence question, talk to one.
