Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't warn when we get invalid request lines #18

Closed
wants to merge 1 commit into from

Conversation

hiratara
Copy link
Contributor

When I send an invalid request to Starlet, I get following warnings. If someone sends such requests many times, my log file will be filled with this message.

% carton exec -- local/bin/plackup -Ilib -s Starlet -e 'sub { [200, [], ["OK"]] }'
Plack::Handler::Starlet: Accepting connections at http://0:5000/
Use of uninitialized value $protocol in string eq at lib/Starlet/Server.pm line 411.
Use of uninitialized value $protocol in string eq at lib/Starlet/Server.pm line 428.
% telnet localhost 5000
Connected to localhost.
Escape character is '^]'.
GET /invalid request line/ HTTP/1.0

HTTP/1.1 400 Bad Request
Date: Wed, 31 Dec 2014 04:08:54 GMT
Server: Plack::Handler::Starlet
Content-Type: text/plain

Bad RequestConnection closed by foreign host.

My patch stops these messages and returns Connection: closed header to disconnect the keep-alive connection.

@kazuho
Copy link
Owner

kazuho commented Jan 1, 2015

Thank you for the pull request. I agree that the behaviour should be fixed.

However I wonder if passing a broken request to PSGI handler conforms to the specification. OTOH users may prefer to log that broken requests were received by the server, and in order to log them the requests need to be passed to the handler (e.g. some access-logging middleware).

@miyagawa Do you have any recommendation how the issue should be handled?

@miyagawa
Copy link
Contributor

miyagawa commented Jan 2, 2015

I don't have a strong opinion about this and it's totally up to the server implementations. For example with Apache + (Fast)CGI setup, most invalid HTTP requests will be handled as 400 in Apache and it won't reach the actual PSGI app itself.

@kazuho
Copy link
Owner

kazuho commented Jan 2, 2015

@miyagawa Thank you for the answer. I am glad to know that you are not against the idea of pushing corrupt requests to PSGI handlers (a behavior that would not be strictly conformant to the spec.).

The issue is not a problem with Apache. It has it's own access logger which logs broken requests like 127.0.0.1 - - [02/Jan/2015:18:27:46 +0900] "a" 501 213 "-" "-". But most standalone PSGI servers do not implement their own logger but relies on PSGI middlewares for logging the accesses, that is why I wanted to hear opinion.

That said, it seems that we have three possible ways to resolve the issue (or let the user choose among them):

  • a. discard broken requests within Starlet, with Starlet logging some error to STDERR
  • b. discard broken requests within Starlet without any log at all
  • c. send broken requests to the PSGI handler

IMO the most simple approach is a. I do not think not logging anything in case of broken requests (b) is a good approach.

@hiratara Do you have problem with approach a?

@miyagawa
Copy link
Contributor

miyagawa commented Jan 2, 2015

But most standalone PSGI servers do not implement their own logger but relies on PSGI middlewares for logging the accesses, that is why I wanted to hear opinion.

That's an interesting point and I can totally see how developers want to log such requests in the same way as regular requests. I could see an extension or some sort to provide "error handler" app, aside from the main app.

@kazuho
Copy link
Owner

kazuho commented Jan 2, 2015

@miyagawa Thank you for the response.

I could see an extension or some sort to provide "error handler" app, aside from the main app.

IMO it should either be a flag that allows servers to pass incomplete requests to handlers, or an interface to specify a logger that logs both complete and incomplete requests to the servers. In other words, using a different path for errors only is IMO not a good option in this case.

@miyagawa
Copy link
Contributor

miyagawa commented Jan 2, 2015

If there's an option for separate error handler, when a developer wants to handle both complete and incomplete requests in the same app, they can just provide the same app as a main app and an error handler app. If they want to handle error cases differently, a completely different app with different set of middleware could handle them.

Because PSGI app is just a code ref, providing two apps for regular/error cases or one app that handles both with an internal condition, would both accomplish the same thing. It would be just a different user interface.

Which is better in terms of a server management experience would be a different question - I was thinking in terms of PSGI spec extension or common server options. I guess one application with some psgix vars in $env would be more flexible than providing two apps. On the other hand, the error handling is usually considered an optional thing, and if a server handles them with their default handler, being able to override that with a (different) PSGI application makes a bit more sense to me. It's not a strong preference and I'm just thinking out loud.

The definition of "error handler", however, could still be ambiguous - for instance we already have ErrorDocument middleware, although it is a bit weird interface, that has a error processor app option.

@kazuho
Copy link
Owner

kazuho commented Jan 2, 2015

@miyagawa

Because PSGI app is just a code ref, providing two apps for regular/error cases or one app that handles both with an internal condition, would both accomplish the same thing. It would be just a different user interface.
Which is better in terms of a server management experience would be a different question

Agreed. In case we are to instead use a flag for indicating incomplete requests (instead of using separate handlers), a user can split the requests by doing something like below (though AFAIK mount does not accept a coderef).

builder {
    mount { $_->[0]->{'plack.incomplete_request'} } => builder { ... };
    ...
}

As you said, it is a matter of preference.

That said the reason why I believe providing a separate path for handling incomplete requests is not a good idea is because the requirement is as you said:

developers want to log such requests in the same way as regular requests

If the spec. is to require a separate handler for handling incomplete (broken) requests, then applications would need to duplicate the logic that setups the logger. And properly duplicating the setup logic might be difficult if the loggers have complex internal states (e.g. the ouput might be a pipe).

OTOH if we extend the spec. to permit passing in incomplete requests then the configuration can be simple, like:

builder {
    enable 'Plack::Middleware::AccessLog', ...;
    skip_broken_requests; # we would need to figure out how to define this directive
    ...
}

Since access loggers are generally the outermost middleware in the application stack (this is due to the general requirement that loggers should log information of the raw input / output) most of the handlers / middlewares need not be adjusted to support incomplete requests.

Or the other option would be to provide a different method to define loggers for both complete and incomplete requests (the upside of the approach is that it does not need any change to existing plack-based applications / middlewares).

@kazuho
Copy link
Owner

kazuho commented Jan 2, 2015

PS. would it be better to discuss the issue on github.com/plack?

@miyagawa
Copy link
Contributor

miyagawa commented Jan 2, 2015

If the spec. is to require a separate handler for handling incomplete (broken) requests, then applications would need to duplicate the logic that setups the logger

I think that's hypothetical, and reusing the middleware code should be pretty straightforward with no code duplication.

(this is due to the general requirement that loggers should log information of the raw input / output) most of the handlers / middlewares need not be adjusted to support incomplete requests.

well i think that's the same with a separate app handler for errors, since you could reuse that middleware stack as well.

builder {
    mount { $_->[0]->{'plack.incomplete_request'} } => builder { ... };
    ...
}

The problem with this though is that you have to specify this flag twice, one from the app to server "Hey, I can handle (or ignore) such requests", and the other, server to app "This request is an incomplete request". Otherwise you're sending broken requests to a handler (which could be an open source app downloaded from CPAN or github) that are not aware of this flag.

Because PSGI spec does not have a way to hook into "server-launch" phase (there should be a PSGI spec issue for that) it could be totally unrelated config options (i.e. not self-contained).

If you specify two different apps, that's under a complete control of developers.

PS. would it be better to discuss the issue on github.com/plack?

Yep, sounds good. For this specific issue I don't recommend waiting for making a specification though - you can do whatever you want.

@kazuho
Copy link
Owner

kazuho commented Jan 2, 2015

The problem with this though is that you have to specify this flag twice, one from the app to server "Hey, I can handle (or ignore) such requests", and the other, server to app "This request is an incomplete request". Otherwise you're sending broken requests to a handler (which could be an open source app downloaded from CPAN or github) that are not aware of this flag.
Because PSGI spec does not have a way to hook into "server-launch" phase (there should be a PSGI spec issue for that) it could be totally unrelated config options (i.e. not self-contained).

If that is the reason you are opposed to optionally allowing ordinary PSGI handler to accept incomplete requests, I would suggest using callable objects (or blessed subref) for passing information from the app to the server.

For example, Plack::Builder could return a callable object with accept_broken_request flag set if the accept_broken_requests flag is used (as shown in the example below).

builder {
    accept_broken_requests;
    ...
}

@miyagawa
Copy link
Contributor

miyagawa commented Jan 3, 2015

Honestly, the reason for preferring separate handlers is that it doesn't require any PSGI spec update - your server just supports two different apps. Of course it would be nice if there are common/standardized ways for that across multiple servers, but we can work around that if we want to have that in say, Starman.

Your requirement for "duplicate middleware, might not be able to share resource across apps" are valid, but I think it's possible to work around and i don't want to optimize for non-common use case (we have to determine that first though).

@hiratara
Copy link
Contributor Author

hiratara commented Jan 3, 2015

Honestly, I wanted to stop any logs for invalid requests. However, I found that it was better that Starlet (or a PSGI middleware) logged broken requests if we wanted.

I'll keep watching plack/psgi-specs#37 .

@miyagawa
Copy link
Contributor

miyagawa commented Jan 3, 2015

Yeah, the option to handle them would be nice, but just ignoring such requests by default sounds sensible too.

btw do they pass even if you put Starlet behind nginx, or is it directly connecting to the backend?

@hiratara
Copy link
Contributor Author

hiratara commented Jan 3, 2015

I use Starlet behind nginx and Starlet gets lines which contain spaces in PATH_INFO.

Though this issue is worth considering, I think some setting converts these spaces into %20.

@kazuho
Copy link
Owner

kazuho commented Jan 3, 2015

@hiratara

I use Starlet behind nginx and Starlet gets lines which contain spaces in PATH_INFO.
Though this issue is worth considering, I think some setting converts these spaces into %20.

+1. If Nginx is sending invalid requests to upstream then it should be a problem of Nginx or its configuration.

@miyagawa

Honestly, the reason for preferring separate handlers is that it doesn't require any PSGI spec update - your server just supports two different apps. Of course it would be nice if there are common/standardized ways for that across multiple servers, but we can work around that if we want to have that in say, Starman.

My understanding was that the discussion was how we should extend the PSGI spec., considering th e fact that there is need to update the spec even if you took the two-different-apps approach, since the second app needs to accept incomplete requests (that do not conform to the PSGI spec.).

Considering the requirement to log broken requests out of scope of the PSGI spec. is fine for me. Servers can always implement loggers outside of PSGI.

@miyagawa
Copy link
Contributor

miyagawa commented Jan 3, 2015

since the second app needs to accept incomplete requests (that do not conform to the PSGI spec.).

Well I'm not sure; accepting a pseudo (or incomplete) PSGI environment that represents such broken requests can be still said a "subset" of PSGI. I mean, extending the PSGI to support broken non-HTTP semantic pretty much sounds sub-optimal, if that ends up with PSGI with all these fields that are currently be defined MUST being more relaxed.

Considering the requirement to log broken requests out of scope of the PSGI spec. is fine for me. Servers can always implement loggers outside of PSGI.

Sure. It's just that you might want to reuse PSGI middleware and handlers for that - and for most of the middleware such subset would be fine as well.

@kazuho
Copy link
Owner

kazuho commented Jan 3, 2015

@miyagawa

I mean, extending the PSGI to support broken non-HTTP semantic pretty much sounds sub-optimal, if that ends up with PSGI with all these fields that are currently be defined MUST being more relaxed.

I understand the point.

Sure. It's just that you might want to reuse PSGI middleware and handlers for that - and for most of the middleware such subset would be fine as well.

IIRC most web servers are layered like: application-logic / protocol-handler / logger (from upper to lower level). There is not much you can do other than logging the connection for a broken request. So instead of having a second handler, just providing a interface to set a logger would be more logical IMO if we are to handle the issue outside of PSGI.

Adding a interface to some Plack-based loggers that returns a subref to the logging function, and calling it directly from servers may be a good approach.

@kazuho kazuho closed this in 40bf865 Jun 27, 2015
@kazuho
Copy link
Owner

kazuho commented Jun 27, 2015

The fix has been merged to master (test has been cherry-picked as 36a7c58, the fix has been done in the other way).

note: the question to how we should log such broken requests remains open

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants