Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HTTP1.1] Host header sent twice if also specified in request headers #177

Open
cocoahero opened this issue Aug 28, 2024 · 8 comments
Open

Comments

@cocoahero
Copy link

cocoahero commented Aug 28, 2024

While attempting to use async-http (0.71.0) with against a reverse proxy where we need to specify a Host header, the library sends the header twice: one with the URL's hostname value, and one with the value specified in the request. For most web servers, this results in a 400 Bad Request.

url = URI.parse("https://my.reverse.proxy.com/path/to/resource")

headers = {
  "Host" => "api.example.com"
}

body = JSON.dump({ foo: "bar" })

Sync do
  Async::HTTP::Internet.post(url, headers, body) do |resp|
    puts resp.inspect
  end
end

It looks like this is coming from protocol-http1 where the host header is always written, irrespective if it also exists in headers. https://github.com/socketry/protocol-http1/blob/main/lib/protocol/http1/connection.rb#L133-L138

The reverse proxy use case is one, but it could also be for any situation where the TCP/TLS socket connection should be made to a different authority than what the HTTP Host header dictates. For example, this should also be completely valid:

url = URI.parse("http://localhost:9292/path/to/resource")

headers = {
  "Host" => "example.multi.tenant.app.com" # used to lookup SaaS tenant, etc
}

body = JSON.dump({ foo: "bar" })

Sync do
  Async::HTTP::Internet.post(url, headers, body) do |resp|
    puts resp.inspect
  end
end
@ioquatix
Copy link
Member

ioquatix commented Aug 28, 2024

What you are trying to do is possible, but it's probably not achievable with Async::HTTP::Internet which is a canonical/opinionated interface for accessing internet resources following standard conventions.

More specifically, the host header is not part of HTTP/2+, and therefore the library has a specific field for dealing with host identification, referred to as authority. In HTTP/1, authority maps directly to the host header, and in HTTP/2+ it maps to the :authority pseudo header.

Note that this can be different from the address you connect to (in your case the reverse proxy), and even different from the hostname used for TLS negotiation (can be used to select certificates for validating the connection is secure).

According to your initial problem, something like this will work for you:

server_endpoint = ::IO::Endpoint.tcp("my.reverse.proxy.com", 443)
host_endpoint = ::Async::HTTP::Endpoint.parse("https://api.example.com/path/to/resource", server_endpoint)
client = ::Async::HTTP::Client.new(host_endpoint)

client.post(...)

This will configure the authority and TLS as if it was talking to api.example.com.

@cocoahero
Copy link
Author

Correct me if I am wrong, but creating a client in this way will prevent connection reuse/persistent connections since we will need a Client instance per combination of TCP endpoint and host/authority. In reality, we want to reuse the connection to the TCP endpoint, but just change the Host header per request.

@ioquatix
Copy link
Member

ioquatix commented Aug 30, 2024

It will depend on your TLS setup, if you are expecting SNI (which is at the connection level, usually for virtual hosts) to work, you will need the setup I suggested, and yes, you will want to cache the client instances and yes, you are correct, it will be one connection at least per SNI hostname. If you don't need SNI, and only need to change the authority, you can do this instead:

def get(client, url, headers = Protocol::HTTP::Headers.new)
  client.call(
    Protocol::HTTP::Request.new(nil, url.hostname, "GET", url.path, headers, body)
  )
end

The 2nd argument to the Request.new is the authority (host) to use.

If this was more commonly requested, we could consider adding a keyword argument to the method helpers, e.g. internet.get(..., authority: ...) - however I don't know if your issue encompasses SNI or not - can you give me more details about that part of your setup?

@epk
Copy link

epk commented Aug 31, 2024

We are essentially trying to do the equivalent of the following Go code with async-http, where we want to keep a pool of persistent connections to the reverse proxy.

The reverse proxy only has a TLS certificate for my.tld.com but has millions of virtual hosts.

We use Cloudflare SaaS where they connect to the reverse proxy/loadbalancer with TLS/SNI set to my.tld.com and the Host/:authority header set to the tenant's domain.

https://developers.cloudflare.com/cloudflare-for-platforms/cloudflare-for-saas/reference/connection-details/

Show Code
package main

import (
    "crypto/tls"
    "fmt"
    "io"
    "net"
    "net/http"
    "net/http/httptrace"
    "time"
)

func main() {
    // Create a custom dialer
    dialer := &net.Dialer{
  	  Timeout:   30 * time.Second,
  	  KeepAlive: 30 * time.Second,
    }

    // Create a custom TLS config
    tlsConfig := &tls.Config{
  	  ServerName: "nuc.adi.run",
    }

    // Create a custom transport
    transport := &http.Transport{
  	  DialContext:         dialer.DialContext,
  	  TLSClientConfig:     tlsConfig,
  	  ForceAttemptHTTP2:   true,
  	  MaxIdleConns:        100,
  	  IdleConnTimeout:     90 * time.Second,
  	  TLSHandshakeTimeout: 10 * time.Second,
    }

    // for trace events
    clientTrace := &httptrace.ClientTrace{
  	  GetConn: func(hostPort string) { fmt.Println("starting to create conn ", hostPort) },
  	  GotConn: func(info httptrace.GotConnInfo) { fmt.Printf("connection established %+v\n", info) },
  	  TLSHandshakeDone: func(info tls.ConnectionState, err error) {
  		  fmt.Printf("TLS handshake done %+v\n", info)
  	  },
  	  WroteHeaderField: func(key string, value []string) {
  		  if key == ":authority" {
  			  fmt.Println("Wrote header field ", key, value)
  		  }
  	  },
  	  ConnectStart: func(network, addr string) { fmt.Println("starting tcp connection", network, addr) },
  	  ConnectDone:  func(network, addr string, err error) { fmt.Println("tcp connection created", network, addr, err) },
    }

    // Create a client with the custom transport
    client := &http.Client{
  	  Transport: transport,
    }

    // Create a new request
    req, err := http.NewRequest("GET", "https://nuc.adi.run", nil)
    if err != nil {
  	  fmt.Println("Error creating request:", err)
  	  return
    }
    doRequest(client, clientTrace, req, "code.aditya.me")

    fmt.Println()

    req, err = http.NewRequest("GET", "https://nuc.adi.run", nil)
    if err != nil {
  	  fmt.Println("Error creating request:", err)
  	  return
    }
    doRequest(client, clientTrace, req, "jellyfin.aditya.me")
}

func doRequest(client *http.Client, clientTrace *httptrace.ClientTrace, req *http.Request, overrideHostName string) {
    clientTraceCtx := httptrace.WithClientTrace(req.Context(), clientTrace)
    req = req.WithContext(clientTraceCtx)

    // override Host header
    req.Host = overrideHostName

    resp, err := client.Do(req)
    if err != nil {
  	  fmt.Println("Error performing request:", err)
  	  return
    }
    defer resp.Body.Close()

    io.Copy(io.Discard, resp.Body)
    fmt.Println("Response status:", resp.Status)
}
go run main.go 
starting to create conn  nuc.adi.run:443
starting tcp connection tcp 10.0.0.2:443
tcp connection created tcp 10.0.0.2:443 <nil>
TLS handshake done {Version:772 HandshakeComplete:true DidResume:false CipherSuite:4865 NegotiatedProtocol:h2 NegotiatedProtocolIsMutual:true ServerName:nuc.adi.run PeerCertificates:[0x140001d8008 0x140001d8588] VerifiedChains:[[0x140001d8b08 0x140001d9088 0x140001d9608]] SignedCertificateTimestamps:[] OCSPResponse:[] TLSUnique:[] ECHAccepted:false ekm:0x104d76c80 testingOnlyDidHRR:false testingOnlyCurveID:29}
connection established {Conn:0x14000192008 Reused:false WasIdle:false IdleTime:0s}
Wrote header field  :authority [code.aditya.me]
starting to create conn  nuc.adi.run:443
connection established {Conn:0x14000192008 Reused:true WasIdle:true IdleTime:53.75µs}
Wrote header field  :authority [code.aditya.me]
Response status: 200 OK

starting to create conn  nuc.adi.run:443
connection established {Conn:0x14000192008 Reused:true WasIdle:true IdleTime:82.5µs}
Wrote header field  :authority [jellyfin.aditya.me]
starting to create conn  nuc.adi.run:443
connection established {Conn:0x14000192008 Reused:true WasIdle:true IdleTime:30.375µs}
Wrote header field  :authority [jellyfin.aditya.me]
Response status: 200 OK

@ioquatix
Copy link
Member

Okay, I understand, SNI is not an issue for you.

Similar request has come up in the past, so I've introduced support for keyword arguments to Async::HTTP::Internet, you can specify authority: in your case to set the host header.

See https://socketry.github.io/async-http/releases/index.html#async::http::internet-accepts-keyword-arguments for more details.

@cocoahero
Copy link
Author

Thanks @ioquatix, this unblocks us for now. As an aside, do you think it would also be beneficial to add a sanity check or other defensive measure to ensure the host header is not sent twice? It was not immediately intuitive when we received 400s on the reason why, and we were only able to figure it out because we had access to nginx logs.

@cocoahero
Copy link
Author

Unfortunately, it appears that this change does not play well with WebMock. They use the request.authority when building the stub signature, which results in looking for the wrong host name. https://github.com/bblimke/webmock/blob/master/lib/webmock/http_lib_adapters/async_http_client_adapter.rb#L97-L106

@ioquatix
Copy link
Member

Thanks, I'll take a look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants