[7.5.8] Better URL Sanitization for SEO #319241
03/18/2014 2:20 AM
Joined: Jul 2001
Posts: 1,170
California
isaac Offline OP
$coffee=code(true);
Requirements:
1. Valid UBB.Threads 7.5.8 install and license.
2. The PATH_INFO environmental variable must be available for this feature to function properly.
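To illustrate why PATH_INFO matters, here is a minimal sketch of how a script could pull a topic ID out of the part of the URL that follows ubbthreads.php. The function name `parse_topic_path` and its return shape are hypothetical; this is not UBB.Threads' actual routing code, only an example of the general idea.

```php
<?php
// Hypothetical helper, not part of UBB.Threads: splits a PATH_INFO string
// such as "/topics/45/i-like-turtles.html" into a section name and numeric ID.
function parse_topic_path($pathInfo)
{
    // Drop the leading/trailing slashes, then split on "/".
    $parts = explode('/', trim((string) $pathInfo, '/'));

    // We need at least "<section>/<id>"; the title slug after that is optional.
    if (count($parts) < 2 || !ctype_digit(preg_replace('/\.html$/', '', $parts[1]))) {
        return null;
    }

    return array(
        'section' => $parts[0],
        'id'      => (int) $parts[1], // the cast ignores a trailing ".html"
    );
}

// PATH_INFO is only set when the web server passes it through to PHP.
$pathInfo = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
$route = parse_topic_path($pathInfo);
```

In this sketch the title portion of the URL is ignored when locating the topic; only the numeric ID is used.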

About:
This mod converts UBBT's Spider-Friendly URL string to lower-case and strips HTML markup such as &amp; and &quot; from it. It then uses PHP's regular expression replace (preg_replace) to replace every run of characters that is not a letter or a number (spaces included) with a dash. Next, it collapses any repeated dashes into a single dash (a topic title of "nom - nom nom" would previously have produced four dashes; now it produces just two), and finally it trims any leftover dashes from the beginning and end of the string.

Example 1 -
BEFORE:
ubbthreads.php/topics/45/I_Like...._TURTLES!!!!.html
AFTER:
ubbthreads.php/topics/45/i-like-turtles.html

Example 2 -
BEFORE:
ubbthreads.php/topics/44/LoL,_?,_<,_>,_",_&,_,,_+,_|,_!,__,_#,_\,_^,_{,_},_=,_:,.html
AFTER:
ubbthreads.php/topics/44/lol.html
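
As a rough way to try the examples outside the forum, the steps from the REPLACE WITH block in the How-To can be wrapped in a standalone function. The name `seo_slug` is made up for this sketch, and the `strtolower` fallback is only an assumption for PHP installs without the mbstring extension.

```php
<?php
// Sketch of the mod's steps as a standalone function (the name is made up).
function seo_slug($title)
{
    $title = str_replace(array('&amp;', '&nbsp;'), ' ', $title); // markup -> space
    $title = str_replace(array('&quot;', "'"), '', $title);      // drop quote markup
    $title = function_exists('mb_convert_case')
        ? mb_convert_case($title, MB_CASE_LOWER, 'UTF-8')
        : strtolower($title);                                    // fallback without mbstring
    $title = preg_replace('#[^A-Za-z0-9]+#', '-', $title);       // runs of other chars -> one dash
    $title = trim($title, '-');                                  // no dashes at either end
    return substr($title, 0, 70);
}

echo seo_slug('I Like.... TURTLES!!!!'), "\n";       // i-like-turtles
echo seo_slug('LoL, ?, !, #, ^, {, }, =, :,'), "\n"; // lol
```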

Warning:
If your forum's language uses any non-ASCII characters, such as the Swedish å, ä and ö, they will be stripped. See Example 2 above.

Notes:
Using dashes/hyphens (-) rather than underscores (_) for spider-friendly URLs is the recommended format and the current standard, replacing the older underscores-for-spaces style that UBB.Threads uses. Some further reading on hyphens (-) vs underscores (_):
http://www.ecreativeim.com/blog/2011/03/seo-basics-hyphen-or-underscore-for-seo-urls/

Quote
The short answer is that you should use a hyphen for your SEO URLs. Google treats a hyphen as a word separator, but does not treat an underscore that way. Google treats an underscore as a word joiner, so red_sneakers is the same as redsneakers to Google. This has been confirmed directly by Google themselves, including the fact that using dashes over underscores will have a (minor) ranking benefit.

Again, SEO URLs should use hyphens to separate words. Do not use underscores, do not try to use spaces, and do not smash all the words together intoonebigword. As of 2012, dashes are still the best way to optimize your SEO URLs.


A video answering the hyphen vs underscore SEO URL question by Matt Cutts.
Matthew "Matt" Cutts leads the Webspam team at Google, and works with the search quality team on search engine optimization issues
YouTube Video: https://www.youtube.com/watch?v=AQcSFsQyct8


How-To:
Confirm that Spider-friendly URLs are turned on:
Control Panel > Primary Settings > Advanced Options > Enable Spider-friendly URLs TICK-BOX
Optional: Tick also the "Enable HTML Extension" box.


FIND IN libs/ubbthreads.inc.php:
Code
	$title = ubbchars($title);
	$title = str_replace(' ', '_', trim($title));
	$title = str_replace( '%', '_', $title );
	$title = substr($title, 0, 30);


REPLACE WITH:
Code
	//SEO-friendly URL String Converter
	//ex) this is an example -> this-is-an-example
	$title = str_replace(array("&amp;","&nbsp;"), " ", $title); //replace ampersand and non-breaking-space markup with a space
	$title = str_replace(array("&quot;","'"), "", $title); //remove quote markup and apostrophes
	$title = mb_convert_case($title, MB_CASE_LOWER, "UTF-8"); //convert to lowercase
	$title = preg_replace("#[^A-Za-z0-9]+#", "-", $title); //replace every run of non-alphanumeric characters with a dash
	$title = preg_replace("#(-){2,}#", "$1", $title); //replace multiple dashes with one
	$title = trim($title, "-"); //trim dashes from beginning and end of string, if any
	$title = substr($title, 0, 70); //limit the length to 70 characters


done.

---

Further reading on why this simple modification is necessary in UBB.threads 7.5.8:
"Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly SEO URLs"
http://www.ubbcentral.com/forums/ubbthreads.php/topics/255500

Last edited by id242; 07/27/2014 5:04 PM.
Re: [7.5.8] Better URL Sanitization for SEO [Re: isaac] #319242
03/18/2014 3:30 AM
Joined: Jan 2000
Posts: 5,938
Portland, OR, USA
Gizmo Offline
UBB.Dev / UBB.Wiki Owner
Time Lord
Great hack. For someone whose charset uses non-Latin characters, those could be converted to Latin-based characters with another replace after:
Code
    $title = str_replace("&nbsp;", " ", $title);


Do something like:
Code
    $title = str_replace(array("å","ä","ö"), array("a","a","o"), $title);
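
A slightly fuller sketch of that idea follows. It maps both upper- and lower-case letters so the replacement works before lowercasing, and it assumes the PHP source file and the title are both UTF-8 encoded. The function name `seo_slug_translit` and the choice of `strtr` are assumptions for illustration, not part of the mod.

```php
<?php
// Hypothetical wrapper: transliterate a few Swedish letters, then slug the rest.
function seo_slug_translit($title)
{
    // Map both cases so this works before lowercasing. Assumes this source
    // file and $title are both UTF-8 encoded.
    $map = array(
        'å' => 'a', 'ä' => 'a', 'ö' => 'o',
        'Å' => 'a', 'Ä' => 'a', 'Ö' => 'o',
    );
    $title = strtr($title, $map);
    $title = strtolower($title); // only ASCII letters matter from here on
    $title = preg_replace('#[^A-Za-z0-9]+#', '-', $title);
    return trim($title, '-');
}

echo seo_slug_translit('Smörgåsbord är gott'), "\n"; // smorgasbord-ar-gott
echo seo_slug_translit('ÅTERBLICK'), "\n";           // aterblick
```

Any letters not listed in the map are still stripped by the regular expression, so the map would need extending for other languages.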


UBB.Dev - Putting Dev into UBB.threads
Company: VNC Web Services - UBB.threads Scripts and Scripting, Install and Upgrade Services, Site and Server Maintenance.
Forums: A Gardeners Forum, Scouters World, and UGN Security
UBB.Threads: My UBB Themes, UBB.Sitemaps
Re: [7.5.8] Better URL Sanitization for SEO [Re: isaac] #320575
07/27/2014 3:22 PM
Joined: Jul 2001
Posts: 1,170
California
isaac Offline OP
$coffee=code(true);
I just noticed that quotes were being stored in ubbt_TOPICS/TOPIC_SUBJECT as markup ("&quot;") rather than as literal characters. This is fine in most places throughout UBBT, but in this one location it doesn't produce the identical URL we're looking for. It creates a new link for identical content, i.e., it raises "duplicate content" flags for "dumb" spiders/crawlers.

I've updated the OP by adding a new line to convert those quote markups.
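
To show the effect, here is a small sketch with the quote-stripping step made optional. The function name `slug_core`, its flag, and the sample subject are hypothetical and exist only to make the difference visible.

```php
<?php
// Same core steps as the mod, with the quote-stripping step made optional
// so the difference is visible. The function name and flag are hypothetical.
function slug_core($title, $stripQuoteMarkup)
{
    if ($stripQuoteMarkup) {
        $title = str_replace(array('&quot;', "'"), '', $title);
    }
    $title = strtolower($title);
    $title = preg_replace('#[^A-Za-z0-9]+#', '-', $title);
    return trim($title, '-');
}

$stored = 'Say &quot;hello&quot; world'; // subject as stored with markup
echo slug_core($stored, false), "\n"; // say-quot-hello-quot-world
echo slug_core($stored, true), "\n";  // say-hello-world
```

Without the extra step, the letters of the entity name ("quot") survive the sanitizer and end up in the URL.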

Sometime in the future, I'll look into why the "Link to this individual post" link (the post icon at the top-left of each post/reply) is pulling from ubbt_TOPICS/TOPIC_SUBJECT rather than ubbt_POSTS/POST_SUBJECT, and also why markup is being stored in ubbt_TOPICS/TOPIC_SUBJECT rather than literal characters. It may be groundwork for a future feature "SD" mentioned on 05/10/2014, relating to getting away from being able to change the topic mid-discussion... or it may just be a UBBT bug?

EDIT: The post link pulls from ubbt_TOPICS/TOPIC_SUBJECT rather than ubbt_POSTS/POST_SUBJECT so as to avoid creating new URLs for the same page content.

Last edited by id242; 07/27/2014 4:26 PM.
Re: [7.5.8] Better URL Sanitization for SEO [Re: isaac] #320576
07/27/2014 4:29 PM
Joined: Jan 2000
Posts: 5,938
Portland, OR, USA
Gizmo Offline
UBB.Dev / UBB.Wiki Owner
Time Lord
Originally Posted by id242
relating to getting away from being able to change the topic mid-discussion
I've always been against changing the topic mid-discussion; in my opinion, if the topic needs to change within a thread, it should probably be its own thread too (in most cases). I believe that's why Rick, at one point, added the ability to rename the entire topic.


Re: [7.5.8] Better URL Sanitization for SEO [Re: isaac] #320577
07/27/2014 5:23 PM
Joined: Jul 2001
Posts: 1,170
California
isaac Offline OP
$coffee=code(true);
UPDATED, once again:

Tightened up the code a bit.

I've also adjusted the $title string length so that the URL will show the full 50-character topic title instead of just the first 30 characters.

The default subject title limit is 50 chars. My setting of "70" should more than cover that.

The main intentions of this modification are to 1) better sanitize the URL, 2) let the user see where he's about to arrive before clicking the link, and 3) improve SEO.
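
A quick sketch of the length limit in isolation. The helper name `truncate_slug` is hypothetical, and the extra `rtrim` is an optional addition that is not in the OP's code; it only removes a dash that a cut might leave at the very end.

```php
<?php
// Hypothetical helper: cut an already-sanitized slug to a maximum length
// and remove a dash that the cut may have left at the end.
function truncate_slug($slug, $max)
{
    // The slug holds only ASCII letters, digits and dashes at this point,
    // so byte-based substr() cannot split a multibyte character.
    return rtrim(substr($slug, 0, $max), '-');
}

$slug = '106-to-112-bentinck-street-help-photos'; // 38 characters
echo truncate_slug($slug, 30), "\n";           // 106-to-112-bentinck-street-hel
echo truncate_slug($slug, 70), "\n";           // 106-to-112-bentinck-street-help-photos
echo truncate_slug('red-sneakers', 4), "\n";   // red
```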

Re: [7.5.8] Better URL Sanitization for SEO [Re: isaac] #321036
09/04/2015 9:25 PM
Joined: Nov 2003
Posts: 331
UK
Mark_S Offline
Beta Tester
Is this in the newer versions by default, or is it a user's choice?
I'm on 7.5.8; or do I wait for v6? My character set is currently set to iso-8859-1 in the UBB language file.

My conversion didn't go as expected on my dev board, so I've not changed it on my live board... Click Me


BOOM 7.6.+ rocks....
Re: [7.5.8] Better URL Sanitization for SEO [Re: isaac] #321039
09/05/2015 6:33 AM
Joined: Jan 2000
Posts: 5,938
Portland, OR, USA
Gizmo Offline
UBB.Dev / UBB.Wiki Owner
Time Lord
This is the latest build of the URLs, the same that's included in 7.6.0 (the latest Snapshot build is Fri Aug 28 2015).

As for your issue with converting to UTF-8, aren't some of the characters used on your forum multibyte in UTF-8? If so, you can't just switch over to UTF-8, as existing ISO-8859-1 text would not be valid UTF-8 for those characters without conversion. We've written a Wiki article regarding this issue at UTF-8 vs Latin-1 (ISO-8859-1), which also has links to several character set related issues.
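
As a small illustration of the encoding difference (not a migration procedure), the sketch below re-encodes a single ISO-8859-1 "å" with PHP's `iconv()` and prints the resulting bytes. The variable names are arbitrary.

```php
<?php
// "å" as a single ISO-8859-1 byte (0xE5).
$latin1 = "\xE5";

// iconv() re-encodes it as the two-byte UTF-8 sequence 0xC3 0xA5.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

echo strlen($latin1), "\n"; // 1
echo strlen($utf8), "\n";   // 2
echo bin2hex($utf8), "\n";  // c3a5
```

The lone 0xE5 byte is not a valid UTF-8 sequence on its own, which is why stored text has to be converted rather than simply relabelled.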

I replied to your thread at Central for the second issue, so as not to derail this thread.

Last edited by Gizmo; 09/05/2015 7:16 AM.

Re: [7.5.8] Better URL Sanitization for SEO [Re: isaac] #321040
09/06/2015 9:56 AM
Joined: Nov 2003
Posts: 331
UK
Mark_S Offline
Beta Tester
Thanks for the feedback Gizmo, appreciated.
I'm going to implement the code above, as I think it's hurting my search results, and I'm not sure when v6 will be installed on my live forum.
As always, it's good to have you guys as guidance.


Re: [7.5.8] Better URL Sanitization for SEO [Re: isaac] #321041
09/06/2015 3:26 PM
Joined: Jan 2000
Posts: 5,938
Portland, OR, USA
Gizmo Offline
UBB.Dev / UBB.Wiki Owner
Time Lord
The 7.6.0 snapshots should be stable enough to run on your forums; it's what we've been running here for some time now, if you wanted to test it with a live group.


Re: [7.5.8] Better URL Sanitization for SEO [Re: isaac] #321044
09/06/2015 5:32 PM
Joined: Nov 2003
Posts: 331
UK
Mark_S Offline
Beta Tester
I'm considering doing that, Gizmo.
I have time to give it a go and pick up the pieces too.


Url fails #321216
09/27/2015 7:50 AM
Joined: Nov 2003
Posts: 331
UK
Mark_S Offline
Beta Tester
On my live forum.
Running 7.5.8 with id242's SEO hack in place.

I'm on a mobile phone, using hotel wifi. The following link fails.
Code
http://www.wikiwirral.co.uk/forums/ubbthreads.php/topics/984942/106-to-112-bentinck-street-help-photos.html#Post984942


The actual subject is:

106 to 112 Bentinck Street help / photos

If I pull it back to this by editing it in my browser

http://www.wikiwirral.co.uk/forums/ubbthreads.php/topics/984942

it gets me there with no problem.

But the original, when I click on it on my mobile, eventually throws this at me:

http://192.168.250.1/info

The current connection is thus. "PALACE_WIFI"(50:17:ff:f4:94:00)

IP address: 192.168.248.183
Lease duration: 900 sec
Gateway: 192.168.248.1
Netmask: 255.255.252.0
DNS1: 8.8.8.8
DNS2: 195.10.102.11
Server IP: 192.168.248.1
Link speed: 144 Mbps
Hidden SSID: No

Just feedback, as this hasn't happened to me before. As I'm away, I can't compare with a desktop PC.

The preview button truncated the URL, so it's in a code block now.

The recent post island and New Topic island and the topic subject pull the same extra-long broken URL.

However, I have a custom "most viewed" post island hack with the following URL:
http://www.wikiwirral.co.uk/forums/ubbthreads.php/topics/984942.html

And that's working just fine.

Just feedback, in case there is an issue with / in the subject line.


Re: Url fails [Re: Mark_S] #321217
09/27/2015 3:04 PM
Joined: Jan 2000
Posts: 5,938
Portland, OR, USA
Gizmo Offline
UBB.Dev / UBB.Wiki Owner
Time Lord
Well, accessing the URL from a desktop works without issue, and I tested on my cell as well without any issue. Likely the cause is the hotel wifi, if you've never experienced the issue before.

As for getting the URL http://192.168.250.1/info, that's an IP local to the network you're connected to (192.168.x.x is a private range that isn't routed on the public internet, just like 10.x).
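
For anyone wanting to check an address themselves, PHP's `filter_var()` can tell private ranges apart from public ones. The helper name `is_private_ip` is made up for this sketch.

```php
<?php
// Returns true when $ip is a valid address inside a private range
// (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 for IPv4).
function is_private_ip($ip)
{
    if (filter_var($ip, FILTER_VALIDATE_IP) === false) {
        return false; // not an IP address at all
    }
    // With FILTER_FLAG_NO_PRIV_RANGE, filter_var() rejects private addresses.
    return filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_NO_PRIV_RANGE) === false;
}

var_dump(is_private_ip('192.168.250.1')); // bool(true)
var_dump(is_private_ip('8.8.8.8'));       // bool(false)
var_dump(is_private_ip('192.1.2.3'));     // bool(false)
```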

FYI, the code that is in Isaac's mod is what's in 7.6.0, which means it's also what we've been running here for months (prior to that we were running the mod, so years); if you've never experienced the issue here then I think it'd be safe to say it's probably the hotel's connection.


Re: Url fails [Re: Mark_S] #321218
09/27/2015 3:16 PM
Joined: Jul 2001
Posts: 1,170
California
isaac Offline OP
$coffee=code(true);
Originally Posted by Mark_S
On my live forum.

Code
http://www.wikiwirral.co.uk/forums/ubbthreads.php/topics/984942/106-to-112-bentinck-street-help-photos.html#Post984942


...the original via my mobile when clicking on it eventually throws this at me

http://192.168.250.1/info

Originally Posted by Mark_S
The recent post island and New Topic island and the topic subject pull the same extra-long broken URL.


It looks like the hotel's wifi connection to the outside network (internet) had a hiccup or just timed-out.

Your error could have happened to any other internet website at that exact time.

Originally Posted by Mark_S
Just feedback, in case there is an issue with / in the subject line.


From the URL you've posted, it doesn't look like the topic's "slash" had anything to do with the problem you had. The topic's "slash" didn't even make it into the URL.

The sixth line of sanitization code in the OP says: convert anything that is not an A-Z, a-z, or 0-9 character (i.e., "A-Za-z0-9") to a dash. It basically sanitizes the whole topic title for inclusion in the URL.
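
Running just that step (plus lowercasing and trimming) on the subject in question shows what happens to the slash. This is a reduced sketch, not the full mod; `strtolower` stands in for the mod's `mb_convert_case` call since the subject is plain ASCII.

```php
<?php
$subject = '106 to 112 Bentinck Street help / photos';

$slug = strtolower($subject);
// " / " is three non-alphanumeric characters in a row; the "+" in the
// pattern turns the whole run into a single dash.
$slug = preg_replace('#[^A-Za-z0-9]+#', '-', $slug);
$slug = trim($slug, '-');

echo $slug, "\n"; // 106-to-112-bentinck-street-help-photos
```

The result matches the title portion of the URL posted above.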

