custom domains - middleware and verification #2145

Open
wants to merge 19 commits into
base: master
from

Member

@Soxasora Soxasora commented May 1, 2025

Description

Part of #1942
Focuses on adding custom domains, providing a working form and a verification process that runs via pgboss.
Uses ACM through LocalStack to simulate bare-bones certificate CRUD.

Screenshots

TBD

Additional Context

#1958 is being split to make review easier and to let us build more on top of it.
This PR doesn't contain Auth Sync or custom branding.

This PR provides:

  • custom domains
    -- context
    -- crud, resolvers, fragments
    -- form with basic UI/UX
    -- ACM certificates support
    -- ACM certificate removal on STOPPED territories [context]; domain gets put on HOLD

  • middleware
    -- custom domain detection via endpoint
    -- doesn't apply referrals yet (no rewrites here)

  • domain verification
    -- job every 5 minutes
    -- robust AWS error handling
    -- log every verification attempt and its steps
    -- normalized schema for Domain, Record, Attempt, Certificate
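
To make the verification flow above a bit more concrete, here is a minimal sketch of a step runner that logs every step of an attempt and stops at the first step that isn't verified. The function name, the step shape and the status strings are assumptions for illustration, not the PR's actual code.

```javascript
// Hypothetical sketch: `runVerificationAttempt`, the step shape and the
// status strings are assumptions, not taken from the PR.
// Runs the steps in order, logs each one, and stops at the first failure.
async function runVerificationAttempt (steps) {
  const log = []
  for (const { type, check } of steps) {
    try {
      const ok = await check()
      log.push({ type, status: ok ? 'VERIFIED' : 'PENDING' })
      if (!ok) break // later steps depend on earlier ones
    } catch (err) {
      // record the error instead of letting the whole job crash
      log.push({ type, status: 'FAILED', message: err.message })
      break
    }
  }
  const verified = log.length === steps.length &&
    log.every(step => step.status === 'VERIFIED')
  return { status: verified ? 'ACTIVE' : 'PENDING', log }
}
```

Each entry in `steps` would wrap one verification type (e.g. CNAME, TXT, SSL), and the returned `log` is what would get persisted as the attempt.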

Checklist

Are your changes backwards compatible? Please answer below:
Yes, everything is an addition. The one exception is how we handle S3 via LocalStack: the container is now called AWS and simulates ACM in addition to S3.

On a scale of 1-10 how well and how have you QA'd this change and any features it might affect? Please answer below:
7, QA OK: domain verification handles errors correctly and follows its steps; the middleware correctly detects a custom domain and obtains its connected sub.

For frontend changes: Tested on mobile, light and dark mode? Please answer below:
Yes, it uses standard SN colors and styles that fit with theme changes.

Did you introduce any new environment variables? If so, call them out explicitly here:
An endpoint for localstack that can be contacted by our worker container
LOCALSTACK_ENDPOINT=http://aws:4566

DNS resolver for our DNS verification stage
DNS_RESOLVER=1.1.1.1
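
As a rough illustration of how `DNS_RESOLVER` could feed the DNS verification stage, the sketch below checks a TXT record through an injected resolver. `verifyTxtRecord` and its arguments are hypothetical. The only real API assumed is the promise-based `resolveTxt(hostname)` from Node's `dns` module, which resolves to an array of records, each an array of string chunks.

```javascript
// Sketch only: `verifyTxtRecord` and its arguments are hypothetical.
// `resolver` is anything exposing Node's promise-based `resolveTxt`, e.g.:
//   const { Resolver } = require('node:dns').promises
//   const resolver = new Resolver()
//   resolver.setServers([process.env.DNS_RESOLVER])
async function verifyTxtRecord (resolver, recordName, expectedValue) {
  try {
    // each TXT record arrives as an array of chunks that must be joined
    const records = await resolver.resolveTxt(recordName)
    return records.some(chunks => chunks.join('') === expectedValue)
  } catch (err) {
    // a missing record means "not verified yet", not a crash
    if (err.code === 'ENOTFOUND' || err.code === 'ENODATA') return false
    throw err
  }
}
```

Passing the resolver in, rather than creating it inside the function, keeps the check testable without real DNS.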

Soxasora added 2 commits May 1, 2025 18:38
- ACM support
- custom domains crud, resolvers, fragments
- custom domains form, guidelines
- custom domains context
- domain verification every 5 minutes via pgboss
- domain validation schema
- basic custom domains middleware, to be completed
- TODOs tracings
- CustomDomain -> Domain
- DomainVerification table
- CNAME, TXT, SSL verification types
- WIP DomainVerification upsert
@Soxasora Soxasora force-pushed the custom_domains_base branch from cd46aeb to 5e80c3f Compare May 2, 2025 23:33
@Soxasora Soxasora closed this May 6, 2025
@Soxasora Soxasora reopened this May 6, 2025
Soxasora added 9 commits May 6, 2025 14:51
- use DomainVerificationStatus enum for domains and records
- adapt Territory Form UI to new schema
- return 'records' as an object with its types
- wip: prepare for attempts and certificate usage for prisma
fix:
- fix setDomain mutation transaction
- fix schema typedefs

enhance:
- DNS records guidelines with flex-wrap for longer records

cleanup:
- add comments to worker
- remove console.log on validation values
@Soxasora Soxasora force-pushed the custom_domains_base branch from 775a966 to 82a71f5 Compare May 8, 2025 23:33
@Soxasora Soxasora marked this pull request as ready for review May 8, 2025 23:55
Member

huumn commented May 14, 2025

https://github.com/stackernews/stacker.news/blob/c7321351bfef74dc2cfa9d0c2542bb3fbbc5db2c/docs/dev/custom-domains-base.md

So to QA this I need to use real DNS? I suspect we'll want to mock DNS somehow, like we mock ACM etc, going forward to make this trivial to iterate on.
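
One way the DNS mocking idea could look is an in-memory resolver that mimics the method names of Node's promise-based `dns` Resolver. This is a hypothetical test helper, not something in the PR; the hostnames, the CNAME target and the token are made up for illustration.

```javascript
// Hypothetical test helper; zone contents are made up for illustration.
// Mimics the `resolveCname` / `resolveTxt` methods of Node's dns promises API.
function createMockResolver (zone) {
  const lookup = async (type, hostname) => {
    const answer = zone[hostname.toLowerCase()]?.[type]
    if (!answer) {
      // Node's DNS errors carry a `code` property such as 'ENOTFOUND'
      const err = new Error(`query${type} ENOTFOUND ${hostname}`)
      err.code = 'ENOTFOUND'
      throw err
    }
    return answer
  }
  return {
    resolveCname: hostname => lookup('CNAME', hostname),
    resolveTxt: hostname => lookup('TXT', hostname)
  }
}

const resolver = createMockResolver({
  'forum.example.com': { CNAME: ['stacker.news'] },
  '_verify.forum.example.com': { TXT: [['some-token']] }
})
```

A worker written against the real resolver's interface could then be pointed at `resolver` in development.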

Member

@huumn huumn left a comment

I did a very casual high level first pass to get acquainted. It looks very good.

Thank you for separating it into its own PR. Also thanks to @ekzyis for his first review which I'm sure led to lots of improvements.

I left some questions and nits. I'll make another high level pass tomorrow. Then maybe again depending on when we can test this in a self-contained way.

Comment on lines +32 to +34
if (process.env.NODE_ENV === 'development') {
  config.endpoint = process.env.LOCALSTACK_ENDPOINT
}
Member

I think we can define this once in the file-scoped config variable rather than checking it in every function.

Member Author

I was worried about how AWS SDK (v2) would react to undefined properties, but afaict it should ignore them. Adjusted ^^
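
For reference, defining the endpoint once at file scope could look roughly like this. `buildAcmConfig` is a hypothetical helper and the real config likely carries more options; the sketch sidesteps the undefined-property question by only setting `endpoint` when it applies.

```javascript
// Sketch under assumptions: `buildAcmConfig` is a hypothetical helper and
// the real config probably has more options (region, credentials, ...).
function buildAcmConfig (env) {
  const config = {}
  // only set `endpoint` in development, so it is never present as undefined
  if (env.NODE_ENV === 'development' && env.LOCALSTACK_ENDPOINT) {
    config.endpoint = env.LOCALSTACK_ENDPOINT
  }
  return config
}

// built once at module scope, then reused by every function in the file
const config = buildAcmConfig(process.env)
```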

Comment on lines +151 to +153
if (existing.certificate) {
  await deleteDomainCertificate(existing.certificate.certificateArn)
}
Member

Making network requests from inside an interactive tx is a bad idea. It consumes the db connection for the entire roundtrip.

I recommend deleting the certificate before entering the tx or after, depending on which makes sense.

Member Author

Oh! You're right, totally missed this. Going to push changes in a bit :)
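
A minimal sketch of the "after the tx" variant, assuming a Prisma-style `$transaction(async tx => ...)` interactive transaction. `removeDomain`, the `subName` lookup key and the injected `deleteCertificate` are stand-ins for illustration, not the PR's actual resolver code.

```javascript
// Sketch only: `removeDomain`, the lookup key and `deleteCertificate`
// are hypothetical stand-ins; `models` is assumed to be Prisma-like.
async function removeDomain ({ models, deleteCertificate }, subName) {
  // 1. short interactive tx: only database work while the connection is held
  const certificateArn = await models.$transaction(async tx => {
    const existing = await tx.domain.findUnique({
      where: { subName },
      include: { certificate: true }
    })
    if (!existing) return null
    await tx.domain.delete({ where: { subName } })
    return existing.certificate?.certificateArn ?? null
  })

  // 2. the network roundtrip to ACM happens after the tx has committed
  if (certificateArn) await deleteCertificate(certificateArn)
}
```

The trade-off with this ordering is that if the ACM call fails after the commit, the certificate is left behind and needs a retry or cleanup path. Deleting before the tx has the opposite failure mode: the certificate is gone even if the tx later rolls back.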

Comment on lines +1 to +40
import prisma from '@/api/models'

// TODO: Authentication for this?
// API Endpoint for getting all VERIFIED custom domains, used by a cachedFetcher
export default async function handler (req, res) {
  res.setHeader('Access-Control-Allow-Origin', '*')
  res.setHeader('Access-Control-Allow-Methods', 'GET')
  res.setHeader('Access-Control-Allow-Headers', 'Content-Type')

  // Only allow GET requests
  if (req.method !== 'GET') {
    return res.status(405).json({ error: 'Method not allowed' })
  }

  try {
    // fetch all VERIFIED custom domains from the database
    const domains = await prisma.domain.findMany({
      select: {
        domainName: true,
        subName: true
      },
      where: {
        status: 'ACTIVE'
      }
    })

    // map domains to a key-value pair
    const domainMappings = domains.reduce((acc, domain) => {
      acc[domain.domainName.toLowerCase()] = {
        subName: domain.subName
      }
      return acc
    }, {})

    return res.status(200).json(domainMappings)
  } catch (error) {
    console.error('cannot fetch domains:', error)
    return res.status(500).json({ error: 'Failed to fetch domains' })
  }
}
Member

This endpoint just exists for the cachedFetcher call in lib/domains.js?

And that's because middleware can't import the prisma stuff, right?

Member Author

Exactly that. I looked into Prisma Edge, but the changes it would need to work with the actual codebase make the endpoint the far more bearable option.

btw I just noticed that I didn't clean up this endpoint, brb
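
To illustrate the middleware side of this, here is a small sketch of how the JSON returned by the endpoint could be used to resolve a request's host to a sub. `getSubForHost` is a hypothetical helper, not the PR's code; it relies only on the response shape shown above (lowercased domain name mapped to `{ subName }`).

```javascript
// Hypothetical helper: `getSubForHost` is not from the PR.
// `domainMappings` has the shape returned by the endpoint above:
//   { 'forum.example.com': { subName: 'bitcoin' } }
function getSubForHost (domainMappings, hostHeader) {
  if (!hostHeader) return null
  // drop an optional port and normalize case, since the endpoint lowercases keys
  // (note: this simple split does not handle bracketed IPv6 hosts)
  const hostname = hostHeader.split(':')[0].toLowerCase()
  return domainMappings[hostname]?.subName ?? null
}
```

A `null` result would mean the request is not on a known custom domain and should be handled as usual.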

Comment on lines +136 to +137
-- !QUIRK: If someone else takes over the sub, the new guy needs to delete the domain
CREATE OR REPLACE FUNCTION hold_domain_and_delete_certificate()
Member

That is weird. I guess it's not clear to us whether the transfer is from an alt to another alt or to another person entirely.

Member Author

I think we can actually just remove the custom domain settings and certificate if the owner is different on re-activation.

On transfer, instead, we retain the settings.
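
Spelled out as code, the rule proposed here might look like the sketch below. The function, event names and field names are all assumptions made for illustration; nothing in the PR defines them.

```javascript
// Hypothetical sketch of the proposed rule; event and field names are assumptions.
function shouldResetDomain ({ event, previousOwnerId, nextOwnerId }) {
  // explicit transfer: keep the custom domain settings
  if (event === 'TRANSFER') return false
  // re-activation of a stopped territory: reset only if someone else took over
  if (event === 'REACTIVATION') return previousOwnerId !== nextOwnerId
  return false
}
```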

Comment on lines +19 to +30
  // make sure to delete the certificate from ACM if the sub is stopped, if we have it.
  if (nextStatus === 'STOPPED' && sub.domain?.certificate?.certificateArn) {
    await deleteDomainCertificate(sub.domain.certificate.certificateArn)
  }

  await models.sub.update({
    include: { user: true },
    where: {
      name: subName
    },
    data: {
-     status: nextBillingWithGrace(sub) >= new Date() ? 'GRACE' : 'STOPPED',
+     status: nextStatus,
Member

hmmm hard to tell if this is the right behavior unless we automatically re-add the cert when it's paid.

Member Author

@Soxasora Soxasora May 15, 2025

Hmm, getting a new certificate from ACM is fast and free. What's not great (for the territory owner) is that they'd have to go through ACM DNS validation again.

Maybe we don't need to remove the certificate at all as long as the domain stays the same, but this is also somewhat connected to what you said in the previous comment:

"I guess it's not clear to us whether the transfer is from an alt to another alt or to another person entirely."
