3

I'm using Postgres 9.5. I have a table with a column that records URLs. Sometimes the URL has a query string and sometimes it does not. I would like to extract the URL, minus any query string, so I came up with:

select substring(url, 0, case position('?' in url) when 0 then length(url)+1 else position('?' in url) end) 
from article;

This seems a little wordy and I was wondering if there's a more concise way to do this. My table column is of type TEXT.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
Dave
  • 753
  • 10
  • 22
  • 39

3 Answers3

6

You can use regexp_replace() to replace everything after the first ? with nothing:

select regexp_replace(url, '\?.*$', '')

The following example:

with data (url) as (
   values 
    ('http://foo.bar/some/page'),
    ('http://foo.bar/some/page?id=42&name=Marvin')
)
select url, regexp_replace(url, '\?.*$', '') as clean_url
from data;

returns:

url                                        | clean_url               
-------------------------------------------+-------------------------
http://foo.bar/some/page                   | http://foo.bar/some/page
http://foo.bar/some/page?id=42&name=Marvin | http://foo.bar/some/page
4

I suggest split_part() for this, certainly more concise and simpler and substantially cheaper than regular expressions, too:

SELECT split_part(url, '?', 1) AS stripped_url FROM tbl;

Does exactly what you want. If there is no '?', the whole string is the result.

db<>fiddle here

Related:

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
1

Or, you could do it using a simple CASE statement.

I'm not sure of the performance implications, but I have a feeling that CASE should be more performant - will try and test when I get home - bank holiday today!

Check the fiddle here:

WITH my_data (url) AS 
(
   VALUES 
    ('http://foo.bar/some/page'),
    ('http://foo.bar/some/page?id=42&name=Marvin')
)
SELECT
  CASE
    WHEN POSITION('?' IN url) > 0 THEN
      LEFT(url, POSITION('?' IN url) - 1)
    ELSE
      url
  END AS my_result
FROM my_data;

Result:

my_result
----------------------------
http://foo.bar/some/page
http://foo.bar/some/page

This solution relies on no other assumption than that the requirement that the deletion takes place from the first question mark till the end, which is same assumption as the other solution makes.

Vérace
  • 30,923
  • 9
  • 73
  • 85