NamePros Status Updates

Paul · Aug 11, 2015

2016-05-14 From now on, we'll be posting status updates on our new status page. More info

Original post:

We'll be posting status updates and information about notable changes in this thread. Subscribe to receive alerts.

Dates and times are in UTC.

* Please report any issues to Technical Support. Thanks.

Paul · Aug 11, 2015

2015-08-11 from 00:17 to 16:32

We've been deploying fixes over the past 16 hours for a handful of rare bugs. Some users may have had trouble uploading new avatars during this time.

Notable fixes:

Avatar uploads were occasionally failing due to a caching inconsistency.
Like/dislike/think/unlike links would rarely time out as a result of a database transaction deadlock.
Report submission occasionally resulted in a scary error. The reports still submitted fine, though.

Paul · Sep 10, 2015

2015-09-10

Over the past few days we've been tweaking the pattern-matching code ("regexes") that we use to detect spam for several reasons:

We block spambots quite effectively, but we're seeing an increase in spam from humans who take the time to evade the spam filters. This smarter code will make their job more difficult.
Several members have been reporting false positives, particularly for posts with complex URLs. The new code does a better job of distinguishing normal URLs from spam-like content.
It's theoretically possible to launch a Denial of Service attack ("DoS") against a complex regex. We've reworked our regexes to be more efficient and avoid the typical pitfalls.
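As a rough illustration of the regex DoS point above, here is a minimal Python sketch. Both patterns are hypothetical and are not the actual NamePros spam filters; the names `RISKY`, `SAFER`, `accepts`, and `seconds_to_reject` are made up for this example. It shows how a nested quantifier can be rewritten so that a failed match no longer forces the engine to try a huge number of alternatives.

```python
import re
import time

# Hypothetical patterns for illustration only; not the actual spam filters.
# RISKY nests a quantifier inside a quantifier, so a run of word characters
# can be split into groups in exponentially many ways when the match fails.
RISKY = re.compile(r"(\w+\s?)+")

# SAFER accepts the same strings, but each whitespace character acts as a
# fixed delimiter, so there is only one way to divide the input.
SAFER = re.compile(r"\w+(?:\s\w+)*\s?")


def accepts(pattern, text):
    """Return True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


def seconds_to_reject(pattern, n):
    """Time a failed match against n word characters followed by '!'."""
    text = "a" * n + "!"
    start = time.perf_counter()
    assert not accepts(pattern, text)
    return time.perf_counter() - start


# Both patterns give the same verdict on ordinary input.
for sample in ["buy cheap domains", "double  space", "", "trailing "]:
    assert accepts(RISKY, sample) == accepts(SAFER, sample)
```

Calling `seconds_to_reject(RISKY, n)` with a growing `n` should show the time climbing steeply in a backtracking engine such as Python's `re`, while `seconds_to_reject(SAFER, n)` stays small. Start with a low `n` (around 20) if you try it.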

We'd appreciate your continued feedback (Contact Technical Support) whenever you encounter an ambiguous message stating that your content has been blocked or moderated for being "spam-like", or is otherwise sent to the moderation queue unnecessarily. We can use information about these incidents to avoid future false positives.

Note that new members will still be unable to post links; this is unrelated.

Paul · Sep 27, 2015

2015-09-27

We built a new feature for NamePros: aggregate forums! This feature allows us to display threads from a variety of different forums in a single place: an aggregate forum.

The immediate benefit of these new aggregate forums is that more content is conveniently accessible, and threads in other areas get more exposure. Rather than having to view multiple pages to see related content, you can now find it all grouped in one place in an aggregate forum.

Moving forward, it will also allow us to create more sub-forums to better organize threads without hiding those threads in forums that no one visits. For instance, we created two new discussion forums: Niche Domain Discussion and Numeric Domain Discussion. Discussions taking place in these new niche and numeric forums will also appear under the first aggregate forum below.

New aggregate forums

We'll be making additional tweaks over the next few days.

Paul · Oct 6, 2015

2015-10-05 from 15:29 to 16:08

At 15:29 yesterday we were alerted to a widespread CloudFlare outage. CloudFlare provides the network that we use to distribute our content around the world as efficiently as possible. When CloudFlare goes down, so does NamePros, along with millions of other websites. Usually these outages are regional, but a significant technical issue with one of CloudFlare's internet providers caused many of their datacenters to go offline. For the time being they've deactivated the problematic ISP by taking the connected datacenter offline, which brought the other datacenters back online at 16:08.

As of 17:06 on October 6 (a day later), we're still seeing occasional issues, but the problem has mostly been resolved. We're detecting rare connectivity issues from Japan and Germany. North America seems stable.

The initial outage affected NamePros for 39 minutes and disrupted about 30% of traffic. Visitors in affected locations were unable to connect to NamePros for the duration of the downtime.
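For anyone checking the arithmetic, the 39-minute figure follows directly from the 15:29 and 16:08 timestamps. A minimal Python sketch (the function name `outage_minutes` is just for this example):

```python
from datetime import datetime


def outage_minutes(start: str, end: str) -> int:
    """Return the whole minutes between two 'YYYY-MM-DD HH:MM' UTC timestamps."""
    fmt = "%Y-%m-%d %H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


print(outage_minutes("2015-10-05 15:29", "2015-10-05 16:08"))  # 39
```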

This sort of event is very rare, but it does happen from time to time, regardless of ISP, datacenter, or network. In the future, we may look into providing an alternative means of access to NamePros should routing issues occur.

Paul · Dec 3, 2015

2015-12-03

We've given the trade review system a makeover:

Some minor bugs have been fixed.
It's now possible to report feedback you receive.
You can now delete feedback you've written.
There's a separate page for the feedback summary so that members with hidden profiles still have visible feedback.
A short feedback summary is shown next to each post on marketplace listings.
Your feedback counts are now visible on your sidebar member card.

We've tested the changes thoroughly, but there's always a chance that we missed something. Please let us know via the support button in the lower right corner if you notice any bugs.

P.S. Posts weren't working for a minute or two while the servers were updating; sorry about that.

Update 2015-12-04: Feedback score percentages are now rounded.

Paul · Dec 16, 2015

2015-12-16

We received a security alert from the XenForo team about 12 hours ago notifying us of a potential vulnerability related to profile posts. We immediately disabled profile posts. After assessing the patch, it appears that the impact of the vulnerability was greatly limited by our own security modifications to the relevant code. We've applied a customized version of the patch provided by XenForo for increased threat mitigation. As of 4:30 AM UTC, profile posts have been re-enabled.

Paul · Dec 18, 2015

2015-12-18

Due to an incomplete update deployment, some aspects of the website didn't work as expected between 2015-12-16 03:31 UTC and 2015-12-16 02:56 UTC. Notably affected was the trade feedback system. The impact should have been relatively minor; however, if you are aware of any inconsistencies that have resulted from the bug, please let us know.

Paul · Dec 20, 2015

2015-12-20

We upgraded our Elasticsearch cluster overnight. Normally such updates don't result in any downtime, but this was a major upgrade that required shutting down the entire cluster, rather than just one server at a time. While the cluster was down, search functionality was unavailable, and users may have seen harmless errors while posting new content or performing searches. A variety of other services had to be upgraded alongside Elasticsearch to avoid compatibility issues; these upgrades were done on a rolling basis and did not affect the uptime of any parts of the site.

As of about 13:00 UTC, everything is tested and stable, but we're still rebuilding our search database. While it rebuilds, features that rely on Elasticsearch will have incomplete results. Notably, certain tabs on member profiles, tags, and the traditional search functionality will be affected. The rebuild process will finish within a few hours. Although results will be incomplete, you should not see any error messages.

This is a significant update that introduces many internal changes, so we'll continue to monitor for errors.

Paul · Feb 17, 2016

2016-02-17

The site was down globally from 10:36 UTC to 10:48 UTC. According to our logs, a memcached server crashed and automatically rebooted. We'll be reworking the cluster over the next few days so that the cluster will rebalance if a node goes down. You may find yourself logged out as we begin to apply the changes.
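One common way to let a cache cluster rebalance when a node disappears is consistent hashing. The post doesn't say which approach is being used here, so treat the following Python sketch purely as an illustration of the idea. The `HashRing` class and the node names are hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring; an illustration, not production code."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual points per node, to smooth the spread
        self._ring = []           # sorted list of (hash, node) points
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode()).hexdigest(), 16)

    def add(self, node):
        for i in range(self.replicas):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        # Dropping a node removes only its own points from the ring.
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def node_for(self, key):
        if not self._ring:
            raise LookupError("no cache nodes available")
        # Walk clockwise to the first point at or after the key's hash.
        idx = bisect.bisect(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])  # hypothetical node names
keys = [f"session:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}
ring.remove("cache-b")  # simulate a crashed node
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only the keys that lived on the removed node move to a new home; everything else stays put. This means a single crash costs part of the cache (and may log out the affected sessions) rather than taking the whole site down.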

Paul · Feb 19, 2016

2016-02-18

We spent a good part of 02/17 and 02/18 updating all of our servers with a critical security patch. Google and Red Hat recently discovered a serious security vulnerability in a core Linux component found on most servers. Because our internal network uses secure caching DNS servers, we don't believe we were ever externally vulnerable to the exploit. However, due to the widespread impact of the vulnerability, we've updated all of our servers and rotated most of our API keys. All web servers were completely replaced with fresh servers running clean images.

We've had no indication that any server has been compromised. The exploit process is not subtle, and we would likely have been able to find traces of an attack, even if it was unsuccessful.

One server didn't properly receive updated API keys, so certain types of emails failed for 2 hours on 02/18. These were primarily thread/forum/blog watch notifications. Aside from delayed or missing notification emails, the issue had no negative impact on our system.

If you run your own Linux server, be sure to update it. Usually this involves running a combination of yum or apt-get commands.

Paul · Mar 19, 2016

2016-03-19

We've spent much of the past month analyzing data that we've collected and preparing a variety of optimizations, several of which went into effect over the past 24 hours. Notably, we've significantly increased the maximum capacity of chat and added another database server. There are inevitably going to be minor bugs, so please let us know if you spot any.

Paul · Mar 21, 2016

2016-03-21

The new database server we added has been a bit temperamental, resulting in sporadic errors for certain pages/operations. We made several tweaks today to cut down on these errors. One of the tweaks was flawed, causing most form submissions to fail for a few minutes, though the site was still accessible. (Thanks to everyone who reported the issue!) The tweak was immediately fixed and re-deployed.

Special thanks to @Shane Bellone, who provided a lot of debug information. As a result of his efforts, we were able to fix a mysterious bug that had proven difficult to reproduce. Additionally, many thanks to all of you who have been sending error reports; they help us discover and monitor bugs more efficiently.

It's worth noting that the forum software we use was not designed for use with multiple database servers. We're venturing into uncharted territory, so there are bound to be occasional glitches no matter how much testing and preparation we do. While the experience of most of our members should be largely uninterrupted, we appreciate the patience of anyone who runs into a problem or two.

Paul · May 6, 2016

2016-05-06

Our CDN, CloudFlare, added a new range of IP addresses. Some users in Thailand received 522 errors when attempting to access namepros.com while we tracked down the source of the problem. CloudFlare doesn't offer notifications of IP space changes, but we now have our own solution in place to monitor for such events.
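The post doesn't describe how the monitoring works, but the general idea can be sketched as a comparison between the last known list of ranges and a freshly retrieved one. The Python below is only an illustration. The function names are made up, the sample ranges are reserved documentation addresses rather than real CloudFlare ranges, and fetching the current list (assumed here to be plain text with one CIDR block per line) is left out.

```python
import ipaddress


def parse_ranges(text):
    """Parse one CIDR block per line, ignoring blank lines."""
    return {
        ipaddress.ip_network(line.strip())
        for line in text.splitlines()
        if line.strip()
    }


def diff_ranges(known, current):
    """Return (added, removed) networks between two range lists."""
    old, new = parse_ranges(known), parse_ranges(current)
    # Sort by string form so mixed IPv4/IPv6 lists can be ordered together.
    return sorted(new - old, key=str), sorted(old - new, key=str)


# Sample data using reserved documentation ranges, not real CDN ranges.
known = "192.0.2.0/24\n198.51.100.0/24\n"
current = "192.0.2.0/24\n198.51.100.0/24\n203.0.113.0/24\n"

added, removed = diff_ranges(known, current)
if added or removed:
    print("IP ranges changed:", added, removed)
```

A scheduled job could run a check like this and raise an alert whenever `added` is non-empty, so that firewall rules get updated before visitors routed through the new range start seeing errors.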

A firewall update at around 5 AM UTC on 2016-05-05 deployed incorrectly and resulted in brief downtime. Users who attempted to access the website during that time saw a blank white page. Availability was inconsistent for no more than 10 minutes, with the website completely unavailable for about 4 minutes. This breaks the record for longest unplanned downtime with our current infrastructure. The firewall update was made in response to an attack that had occurred earlier; ironically, the attack did not affect availability.

Our monitors in Dublin, Ireland detected problems communicating with a CloudFlare PoP in that region early this morning, from about 3:19 to 5:04 UTC. Users in the area likely had trouble connecting to all sites using CloudFlare, including NamePros. Regional internet issues are common, but this was a particularly noticeable event that CloudFlare didn't report on their status page. (Lots of alarms went off on our end.)

Intercom, our customer support platform, was down from 14:22 to 15:02 UTC today. Support queries couldn't be sent during that time.

It's been a crazy week!

Paul · May 14, 2016

2016-05-14

From now on, we'll be posting status updates on our new status page. More info
