3

My website serves pages and other content encoded as UTF-8. For HTML, declaring the encoding in a meta tag is no problem. However, I also have raw text files encoded as UTF-8 that aren't displayed correctly: characters such as × come out garbled. I've considered adding a byte-order mark at the start of such files, but I'd prefer not to, since BOMs aren't always well supported. I followed the instructions in this other question, but it had no effect. This is the HTTP response header:

HTTP/1.1 200 OK
Date: Sat, 12 Aug 2017 15:41:04 GMT
Server: Apache/2.4.10 (Debian)
Last-Modified: Wed, 09 Aug 2017 19:24:33 GMT
ETag: "c04c-5565707a34966"
Accept-Ranges: bytes
Content-Length: 49228
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive

I was hoping to see Content-Type: text/plain; charset=utf-8. How can I get reliable UTF-8 encoding for these URIs?

Brent

3 Answers

9

Content-Type is not sent for 304 Not Modified responses because such a response has no body.

Look at a 200 response and you should see the header. Use Ctrl + F5 to force a full refresh, which yields a 200 response instead of revalidating the cached copy with a 304 response.

You have since updated your question to include a 200 response. I would expect that always to carry a Content-Type: text/plain header or equivalent (even if the character set is not included), but your example has none, so I'm not sure it shows all the details.

Regardless, the correct way to set this is to add the following to your Apache config:

# Set the default charset so it doesn't have to be set per page.
AddDefaultCharset utf-8
# For CSS, JS, etc.
AddCharset utf-8 .htm .html .js .css

The first (AddDefaultCharset) will set the charset for text/plain and text/html resources.

The second (AddCharset) requires mod_mime and sets the charset for other types based on file extension. JavaScript files are sent with a content type of application/javascript and CSS files with text/css, so neither is picked up by the AddDefaultCharset setting. The .htm and .html extensions don't really need to be listed, since those files are covered by the default, but there is no harm in being explicit.
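To illustrate why the charset parameter matters, here is a minimal Python sketch of what a client does with the same bytes depending on the header. The function name decode_body and the cp1252 fallback are assumptions made for illustration: real browsers pick a fallback by locale and content sniffing, so treat this as a rough model rather than exact browser behaviour.

```python
from email.message import Message

# Assumed fallback when no charset is declared; actual browsers vary.
FALLBACK_ENCODING = "cp1252"


def decode_body(body, content_type):
    """Decode response bytes using the charset from a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lower-cased charset, or None if absent.
    charset = msg.get_content_charset() or FALLBACK_ENCODING
    return body.decode(charset, errors="replace")


# The multiplication sign (U+00D7) is two bytes in UTF-8: 0xC3 0x97.
raw = "\u00d7".encode("utf-8")

# With the charset declared, the two bytes decode to the single intended character.
with_charset = decode_body(raw, "text/plain; charset=utf-8")

# Without it, each byte is decoded separately under cp1252, giving two characters.
without_charset = decode_body(raw, "text/plain")

print(len(with_charset), len(without_charset))
```

Under these assumptions the second call yields two characters instead of one, which is the kind of garbling described in the question.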

5

I fixed this problem by adding these lines to 'apache2.conf':

AddType text/plain .yml
AddDefaultCharset utf-8

This was some time ago as of writing this answer. Recent Apache installations may have utf-8 already set as the default.
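To check what a given server actually sends, a small Python sketch along these lines may help. The names charset_verdict and check_url are made up for this example, and the HEAD request assumes the server answers HEAD with the same headers it would send for GET.

```python
from email.message import Message
from urllib.request import Request, urlopen


def charset_verdict(content_type):
    """Summarise what a Content-Type header value says about the charset."""
    if content_type is None:
        return "no Content-Type header"
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # lower-cased, or None if absent
    if charset is None:
        return "Content-Type without charset"
    return "charset=" + charset


def check_url(url):
    """Send a HEAD request to url and report on its Content-Type header."""
    request = Request(url, method="HEAD")
    with urlopen(request) as response:
        return charset_verdict(response.headers.get("Content-Type"))
```

Calling check_url with the address of one of the affected text files (for example a hypothetical http://localhost/notes.txt) should report charset=utf-8 once the configuration is in effect. Because this sends a fresh request with no cache validators, it should not run into the 304 case.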

Brent
0

Extending the answer written by @Barry-Pollard: since AddCharset requires mod_mime (which many configurations load by default as mime_module), it is safer to use the following code. Wrapping AddCharset in IfModule ensures that Apache won't throw an error if mime_module isn't loaded for any reason.

<IfModule mime_module>
    AddCharset UTF-8 .js .css
</IfModule>