Commit c35e7117 authored by Sascha

removed PII

parent c87ac296
MIT License
Copyright (c) 2019 Kristian Freeman
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# License:
Copyright 2021 esaqa GmbH
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
# Licenses of used software
## cloudflare/workers-google-analytics: MIT
MIT License
Copyright (c) 2019 Kristian Freeman
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# workers-google-analytics
# cloudflare-worker-private-google-analytics
Middleware for proxying Google Analytics pageviews via Workers
Middleware for proxying Google Analytics pageviews via Workers that removes all personally identifiable information
## Usage
Install the package and include it in your Cloudflare Workers code. You should pass in a list of origins that will be whitelisted for sending Analytics events to your Google Analytics account.
```sh
$ yarn add workers-google-analytics
$ yarn add cloudflare-worker-private-google-analytics
```
```js
// index.js
import analytics from 'workers-google-analytics'
import analytics from 'cloudflare-worker-private-google-analytics'
const analyticsResp = await analytics(event, {
allowList: ['bytesized.xyz'],
@@ -52,4 +52,4 @@ In the `head` section of your web application, load the script helper and begin
The provided script helper deployed at `ga-helper.developers.workers.dev` is an example -- while you _can_ use it in production, we can't promise that it won't eventually be blocked by uBlock and other similar tools.
To mitigate this, the Workers code for that domain is available in the [`helper` directory](https://github.com/signalnerve/workers-google-analytics/tree/main/helper). You can take that code and deploy it to your own workers.dev subdomain (or a custom domain) and use it accordingly.
To mitigate this, the Workers code for that domain is available in the [`helper` directory](https://gitlab.com/esaqa/workers-google-analytics/tree/main/helper). You can take that code and deploy it to your own workers.dev subdomain (or a custom domain) and use it accordingly.
@@ -4,7 +4,7 @@ This codebase is used to serve a Google Analytics script from your own Workers.d
To deploy this tool, update `wrangler.toml` with your account ID. By default, the tool will deploy to your workers.dev subdomain, though you can fill out `zone_id` and `route` if you'd like to deploy to a custom domain. By default, the `route` definition just serves this code from `.cloudflare/analytics.js`, meaning you can deploy it on an existing site without clobbering the rest of your application logic.
If you're using this tool with the Workers code in [workers-google-analytics](https://github.com/signalnerve/workers-google-analytics), you should update the corresponding `script` tag to point to your own unique instance of the `analytics.js` file:
If you're using this tool with the Workers code in [workers-google-analytics](https://gitlab.com/esaqa/workers-google-analytics), you should update the corresponding `script` tag to point to your own unique instance of the `analytics.js` file:
```html
<script type="text/javascript" src="https://ga-helper.yoursubdomain.workers.dev/_cf/analytics.js"></script>
@@ -5,8 +5,8 @@
</head>
<body>
<div class="centered">
<h1>Google Analytics Middleware for Workers</h1>
<h2>Learn more <a href="https://github.com/signalnerve/workers-google-analytics">on GitHub</a>.</h2>
<h1>Private Google Analytics Middleware for Workers</h1>
<h2>Learn more <a href="https://gitlab.com/esaqa/workers-google-analytics">on GitLab</a>.</h2>
</div>
</body>
</html>
{
  "name": "workers-google-analytics",
  "version": "0.0.2",
  "description": "Middleware for proxying Google Analytics pageviews via Workers",
  "name": "cloudflare-worker-private-google-analytics",
  "version": "0.0.3",
  "description": "Middleware for proxying Google Analytics pageviews via Workers that removes all personally identifiable information",
  "main": "build/main/index.js",
  "typings": "build/main/index.d.ts",
  "module": "build/module/index.js",
  "repository": "https://github.com/signalnerve/workers-google-analytics",
  "license": "MIT",
  "repository": "https://gitlab.com/esaqa/workers-google-analytics",
  "license": "Apache-2.0",
  "keywords": [],
  "scripts": {
    "build": "run-p build:*",
@@ -3,16 +3,31 @@ const GA_ENDPOINT = `https://www.google-analytics.com/collect`;
const originallowlist: string[] = [];

function allowlistDomain(domain: string, addWww = true) {
  const prefixes = ["https://", "http://"];
  const prefixes = ['https://', 'http://'];
  if (addWww) {
    prefixes.push("https://www.");
    prefixes.push("http://www.");
    prefixes.push('https://www.');
    prefixes.push('http://www.');
  }
  prefixes.forEach((prefix) => originallowlist.push(prefix + domain));
}

function cid() {
  return String(Math.random() * 1000); // They use a decimal looking format. It really doesn't matter.
async function cid(uip: string, ua: string): Promise<string> {
  const myText = new TextEncoder().encode(uip + ua);
  const myDigest = await crypto.subtle.digest(
    {
      name: 'SHA-256',
    },
    myText // The data to hash, as a Uint8Array
  );
  const val = new Uint8Array(myDigest);
  const encoded: string[] = [];
  for (let i = 0; i < val.length; i++) {
    encoded.push('0123456789abcdef'[(val[i] >> 4) & 15]);
    encoded.push('0123456789abcdef'[val[i] & 15]);
  }
  return encoded.join('');
}
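
The new `cid` hashes the IP address together with the user agent and hex-encodes the digest. A minimal sketch of that idea, for trying it outside the Worker, is shown here. The names `toHex` and `deriveClientId` are hypothetical and not part of this commit, and the sketch assumes a runtime that exposes the Web Crypto API as the global `crypto`, as Cloudflare Workers does.

```typescript
const HEX_DIGITS = '0123456789abcdef';

// Hypothetical helper: hex-encode raw bytes, two lowercase digits per byte.
function toHex(bytes: Uint8Array): string {
  let out = '';
  for (let i = 0; i < bytes.length; i++) {
    out += HEX_DIGITS[bytes[i] >> 4] + HEX_DIGITS[bytes[i] & 15];
  }
  return out;
}

// Hypothetical helper: derive a pseudonymous client ID from IP and user agent.
// Assumes a global `crypto.subtle` (Web Crypto API).
async function deriveClientId(uip: string, ua: string): Promise<string> {
  const data = new TextEncoder().encode(uip + ua);
  const digest = await crypto.subtle.digest('SHA-256', data);
  return toHex(new Uint8Array(digest));
}
```

Because SHA-256 produces 32 bytes, the resulting ID is always 64 hexadecimal characters, and the same IP and user agent pair always maps to the same ID.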
async function proxyToGoogleAnalytics(event: FetchEvent) {
@@ -20,27 +35,34 @@ async function proxyToGoogleAnalytics(event: FetchEvent) {
  const url = new URL(event.request.url);
  const params =
    event.request.method.toUpperCase() === "GET"
    event.request.method.toUpperCase() === 'GET'
      ? url.searchParams
      : new URLSearchParams(await event.request.text());
  const headers = event.request.headers || <Headers>{};

  // attach other GA params, required for IP address since client doesn't have access to it. UA and CID can be sent from client
  params.set("uip",
    headers.get('cf-connecting-ip') || headers.get("x-forwarded-for") || headers.get("x-bb-ip") || ""
  ); // ip override. Look into headers for clients IP address, as opposed to IP address of host running lambda function
  params.set("ua", params.get("ua") || headers.get("user-agent") || ""); // user agent override
  params.set(
    "cid",
    params.get("cid") || cid()
  );
  // IP override: read the client's IP address from the request headers, as opposed to the IP address of the host running this function
  let uip =
    headers.get('cf-connecting-ip') ||
    headers.get('x-forwarded-for') ||
    headers.get('x-bb-ip') ||
    '';
  const uipSplit = uip.split('.');
  if (uipSplit.length === 4) {
    uipSplit[3] = '0';
    uip = uipSplit.join('.');
  }
  params.set('uip', uip);
  const ua = params.get('ua') || headers.get('user-agent') || ''; // user agent override
  params.set('ua', ua);
  params.set('cid', await cid(uip, ua));

  return fetch(
    new Request(GA_ENDPOINT, {
      method: "POST",
      method: 'POST',
      headers: {
        "Content-Type": "image/gif",
        'Content-Type': 'image/gif',
      },
      body: params.toString(),
    })
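
The truncation step in this hunk zeroes the last octet only when the value splits into exactly four dot-separated parts. As a minimal sketch, the same logic can be isolated into a pure function to make its edge cases visible. The name `anonymizeIpv4` is hypothetical and not part of this commit.

```typescript
// Hypothetical helper: zero the last octet of a dotted-quad IPv4 address.
// Any value that does not split into exactly four parts is returned unchanged.
function anonymizeIpv4(ip: string): string {
  const parts = ip.split('.');
  if (parts.length !== 4) {
    return ip;
  }
  parts[3] = '0';
  return parts.join('.');
}

// Example inputs:
//   '203.0.113.42'                -> last octet zeroed
//   '2001:db8::1'                 -> unchanged (IPv6 contains no dots)
//   '203.0.113.42, 198.51.100.7'  -> unchanged (a header list splits into seven parts)
```

Under this logic, IPv6 addresses and comma-separated `x-forwarded-for` lists pass through without truncation, so a deployment that expects either would need to handle them separately.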
@@ -49,25 +71,27 @@ async function proxyToGoogleAnalytics(event: FetchEvent) {
export async function proxyToGA(event: FetchEvent) {
  const origin =
    event.request.headers.get("origin") || event.request.headers.get("Origin") || "";
    event.request.headers.get('origin') ||
    event.request.headers.get('Origin') ||
    '';
  const isOriginallowlisted = originallowlist.indexOf(origin) >= 0;

  let cacheControl = "no-store";
  let cacheControl = 'no-store';
  const url = new URL(event.request.url);
  if (url.searchParams.get("ec") == "noscript") {
    cacheControl = "max-age: 30";
  if (url.searchParams.get('ec') === 'noscript') {
    cacheControl = 'max-age=30'; // Cache-Control directives use "=", not ":"
  }

  const headers = {
    "Access-Control-Allow-Origin": isOriginallowlisted
    'Access-Control-Allow-Origin': isOriginallowlisted
      ? origin
      : originallowlist[0],
    "Cache-Control": cacheControl,
    'Cache-Control': cacheControl,
  };

  try {
    if (event.request.method === "OPTIONS") {
    if (event.request.method === 'OPTIONS') {
      // CORS (required if you use a different subdomain to host this function, or a different domain entirely)
      return new Response(null, { status: 204, headers: headers });
    } else if (!origin || isOriginallowlisted) {
@@ -76,12 +100,15 @@ export async function proxyToGA(event: FetchEvent) {
      return new Response(null, { status: 401 });
    }
  } catch (err) {
    return new Response(err.toString(), { status: 500 })
    return new Response(err.toString(), { status: 500 });
  }
}
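
The origin check and header selection in `proxyToGA` can be summarized as two pure functions. The sketch below is illustrative only: `expandOrigins` and `corsHeaders` are hypothetical names, and the allowlist is passed in explicitly rather than kept in module state. The cache directive is written as `max-age=30`, the `name=value` form that `Cache-Control` expects.

```typescript
// Hypothetical helper: build the allowed origins for one domain, using the
// same http/https and optional www. prefixes as allowlistDomain.
function expandOrigins(domain: string, addWww = true): string[] {
  const prefixes = ['https://', 'http://'];
  if (addWww) {
    prefixes.push('https://www.', 'http://www.');
  }
  return prefixes.map((prefix) => prefix + domain);
}

// Hypothetical helper: choose the response headers for a request origin.
function corsHeaders(
  origin: string,
  allowed: string[],
  isNoscript: boolean
): Record<string, string> {
  const isAllowed = allowed.indexOf(origin) >= 0;
  return {
    // Echo the origin only when it is allowlisted; otherwise use the first entry.
    'Access-Control-Allow-Origin': isAllowed ? origin : allowed[0],
    'Cache-Control': isNoscript ? 'max-age=30' : 'no-store',
  };
}
```

With `expandOrigins('example.com')`, a request from `http://www.example.com` has its origin echoed back, while any other origin receives `https://example.com` in the header, so the browser rejects the cross-origin response.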
export default async (event: FetchEvent, { allowList }: { allowList: string[] }): Promise<void | Response> => {
  allowList.forEach(origin => allowlistDomain(origin))
  const url = new URL(event.request.url)
  if (url.pathname.includes(".cloudflare/ga")) return proxyToGA(event)
}
\ No newline at end of file
export default async (
  event: FetchEvent,
  { allowList }: { allowList: string[] }
): Promise<void | Response> => {
  allowList.forEach((origin) => allowlistDomain(origin));
  const url = new URL(event.request.url);
  if (url.pathname.includes('.cloudflare/ga')) return proxyToGA(event);
};