Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Purge cache on file change #136

Open
nilshoerrmann opened this issue Jul 8, 2016 · 15 comments
Open

Purge cache on file change #136

nilshoerrmann opened this issue Jul 8, 2016 · 15 comments
Assignees
Milestone

Comments

@nilshoerrmann
Copy link
Contributor

It often happens that clients update images in the backend: similar but edited file, same name, but different. JIT doesn't notice this change and still serves the outdated cache file.

I know that using a different name circumvents this issues but I'd expect JIT to purge the cache for a file when it is edited or deleted by Symphony.

@michael-e
Copy link
Member

michael-e commented Jul 8, 2016

Even if JIT created a new cache file, the client (browser) would still obey the cache headers of the old file. You can not "unset" the cache headers of a file which has already been delivered to a client. In order to ensure that the client actually requests the new file, you either need a new name (e.g. with the Unique Upload Field) or the cache for your files set to 0 set all response headers of your files so that they are never cached (not recommended).

@nilshoerrmann
Copy link
Contributor Author

Well, yes, but what you describe is a secondary issue that can be solved by clearing the browser cache.
The current JIT behaviour results in incorrectly served images even for first time users because the image cache on the server is not updated when the source file changes.

The extension adds the original file name to the cache file name – e. g. 712fc16f4035f77e304e3fd97b5c0d0d_logo_foerdererkreis-hbk.png – so it shouldn't be too hard to remove the related cache files.

@michael-e
Copy link
Member

Have you ever seen a user clearing the browser cache?

You are right when you say that it is incorrect resp. unexpected behaviour, but I think that a fix would introduce complexity while being pointless at the end of the day. You might prove me wrong. :-)

@nilshoerrmann
Copy link
Contributor Author

Have you ever seen a user clearing the browser cache?

Yes, actually I've seen it many times. We actually have clients telling us that they cleared the cache before we instructed them to do so.

You are right when you say that it is incorrect resp. unexpected behaviour

It's not unexpected behaviour. The served file is simply wrong. That's a bug, not a personal taste :)

@nilshoerrmann
Copy link
Contributor Author

For reference, Cacheable Data Source clears the related cache on save.

@michael-e
Copy link
Member

We actually have clients telling us that they cleared the cache

I meant "all the other users", not clients. Or does your website contain a warning like "please drop your cache every time you request a page" (which wouldn't help with proxies anyway).

The served file is simply wrong.

One might debate that. If you identify a resource using a URI/URL, are you really "technically allowed" to change that resource to your liking? A newer version of a person's portrait? No prob. A completely different image with same meaning? Sounds reasonable. But a different image with completely different content/message? Are you sure that this is the intention of the URI/URL concept? :-)

Anyway. Let's assume that JIT delivers "the wrong file" (because someone replaced the original file). So what? It won't help to serve the right file unless you can live with the caches in browsers (and proxies alike). Do you know your cache headers?

If JIT serves a new file, you will see the correct image when you clear your browser cache. Now will you believe that you solved the problem? How funny. Many of your visitors out there will see the old image for a week or so (depending on your cache headers). And this means: Serving the right file has led you to the wrong track. You are even fooling your client/authors, because you make them think that they solved the problem by clearing their browser's cache.

"Fixing" JIT's behaviour does not solve your problem.

@nilshoerrmann
Copy link
Contributor Author

"Fixing" JIT's behaviour does not solve your problem.

It does. Serving the correct image from the server.

I'm never in control of what the site visitor sees – there is the browser cache, there are network connectivity issues and such. I'm aware of all this. But I have control over my server, and my server is serving something wrong. If I "know my cache headers" has nothing to do with this at all.

One might debate that. If you identify a resource using a URI/URL, are you really "technically allowed" to change that resource to your liking? A newer version of a person's portrait? No prob. A completely different image with same meaning? Sounds reasonable. But a different image with completely different content/message? Are you sure that this is the intention of the URI/URL concept? :-)

This as well is a discussion worth to pursue, but it has nothing to do with the case made in this issue: If the user changes an image for whatever reason, even if you don't approve it, the server is not reacting correctly.

Sorry, Michael :)

@nitriques
Copy link
Member

If you identify a resource using a URI/URL, are you really "technically allowed" to change that resource to your liking?

No, you are not if you want to adhere to the spec!

If the user changes an image for whatever reason, even if you don't approve it, the server is not reacting correctly.

But that's the reality: the user did change it. Try to make your client understand a spec is way to much for them.

My proposition is this: If the filemtime of the original file is later than the cache, do not send/use the cache.

This does not require magic on save and should cover most cases. I won't go into invalidating proxies cache, as this is too complex. But at least, jit would send the correct thing, http headers included so any request won't respond with a 304 with with a 200.

@nitriques nitriques added the Bug label Jul 19, 2016
@nitriques nitriques added this to the 2.1.0 milestone Jul 19, 2016
@nitriques nitriques self-assigned this Jul 19, 2016
@michael-e
Copy link
Member

My proposition is this: If the filemtime of the original file is later than the cache, do not send/use the cache.

So the cache would no longer be used for this image?

@nitriques
Copy link
Member

nitriques commented Jul 19, 2016

@michael-e exactly: it would even get overwritten since we would act as if it did not exists.

@michael-e
Copy link
Member

So a new cache file would be written? Sounds good.

@nilshoerrmann
Copy link
Contributor Author

My proposition is this: If the filemtime of the original file is later than the cache, do not send/use the cache.

I like that idea.

@nitriques
Copy link
Member

So a new cache file would be written?

Kindof: the existing one will get overwritten.

I'll implement it ASAP.

nitriques added a commit that referenced this issue Jul 28, 2016
The way we handled HEAD request was

1. Different from the GET request, since it lacked some checks
2. Way to general for our needs

This commit refactors the old things with a single brand new function
that will

1. Make sure the request is a 200
2. Return only cache related headers, since that's the goal

Re #136
Cc #138
Cc #130
Cc #137
nitriques added a commit that referenced this issue Jul 28, 2016
This commit cleans up the code and fixes a bunch of issues.

* Cache validation

There was a bug, #137, which would made the cache unusable with external
images. This has been fixed. The function also makes sure to validate
that the original file, either remote or local, has not changed since
the last modification of the cache file itself. If the cache file is no
longer valid, it gets deleted. This also fixes #136.

* Last modified value

There was a problem describe in #138 where a second request to a url
with proper cache heading would still send a 200, because the
Last-Modified value was read from a different source. This commit
changes this by doing the following.

1. Always use a local mtime: either the original local file or our
tempfile created from the http request.
2. If cache is enabled, always use the cache file mtime, even when the
request creates the cache file. This insure repeatable results for all
subsequent requests.

This brings a side effect that is needed:
before, we used to touch the cache file when found valid and we must
need to stop this to be able to get repeatable results.

---

Throughout the process, values are "exported" into the parameter array
when certain condition are met. This insure that the values sent to the
user will be the right one.

Sorry for the big commit, I did my best to split things up :)

Fixes #137
Fixes #136
Fixes #138
@nitriques
Copy link
Member

This, hopefully, should be all fixed. Relevant part starts here a917e67#diff-d43a412e5f91261a65df2bc0b664df46R128

@nitriques
Copy link
Member

@nilshoerrmann The proposition made (and lots of other stuff) has been implemented in integration. If you could confirm that the issue is resolve, I would apreciate it. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants