Search index

 
itzmie
Benutzer
Avatar
Geschlecht: keine Angabe
Alter: 65
Beiträge: 40
Dabei seit: 05 / 2011
Betreff:

Search index

 · 
Gepostet: 21.12.2016 - 19:17 Uhr  ·  #1
Hi,

My search database is messed up. I can only see topics from one month ago.

Is it possible to reindex the search table and enable the forum in the mean time?
It takes more than 3 days to reindex it.
cback
Admin
Avatar
Geschlecht:
Herkunft: Saarland
Alter: 38
Homepage: cback.net
Beiträge: 17613
Dabei seit: 12 / 2003
Betreff:

Re: Search index

 · 
Gepostet: 21.12.2016 - 20:12 Uhr  ·  #2
Hi itzmie,

unfortunately it is generally not recommended to activate the forum during the Search Index Processing because there could be some inteferences when somebody posts something with words that are currently not indexed and two indexing processes try to create new words at the same time wich could cause some clutter in the index. This would generally not affect the function of the forum, but would maybe kick out some topics from your search index.

But when you say: You can only see topics from one month ago:
Did you drop the Search Index Database in between or do you confuse the Forum Search with the Unread Post Infos (Database Variant)? The UPI System in the Database has a 90 day-limit for new postings by default (you can change this in the ACP General Configuration). So if you are not logged in for more than 90 days you just see the newest unread 90 days topics.

The Search index normally shows all the results - the phenomenon you describe could only occur if you tried to purge / truncate the search_wordmatch or search_index table in between so only topics that were new from then on would be indexed correctly.



To improve the needed time for reindexing I would recommend something different:

Pull up the values for PHP Memory Limit and PHP Max_Execution_Time in your server PHP Configuration as far as you can.

For example 128M Memory Limit and 90 to 120secs max execution time.


Then open the file:

acp/classes/class_control.php


Find these two pieces of code:
Code

  public function sindex()
  {
    global $Smarty, $Core, $DB, $User, $lang;
    
    $step_count = 38;



AND

Code

  public function indexing()
  {
    global $Smarty, $DB, $User, $Core, $lang;
    
    $step_count = 38;





Change: $step_count = 38 to a much higher number. For example you can try it with 200 if you push up your Server Settings like that. Maybe even higher, its not problem to test out if the indexing process still runs. If it stops just change the value back a little and try again.


This would reduce the amount of time necessary for reindexing dramatically. The 38 is quite a small step suitable for most of the server configurations out there and even reliable if you have 38 very large postings with very much words in it.

But with enlarged values for memory and exec time you can go higher there without getting your script stopped by the PHP Interpreter.


Sincerely,
Chris
itzmie
Benutzer
Avatar
Geschlecht: keine Angabe
Alter: 65
Beiträge: 40
Dabei seit: 05 / 2011
Betreff:

Re: Search index

 · 
Gepostet: 22.12.2016 - 10:58 Uhr  ·  #3
Thanks Chris, it's still running after 12 hours with the stepcount set at 10000.

Is it possible to merge the two indexed databases together?
cback
Admin
Avatar
Geschlecht:
Herkunft: Saarland
Alter: 38
Homepage: cback.net
Beiträge: 17613
Dabei seit: 12 / 2003
Betreff:

Re: Search index

 · 
Gepostet: 22.12.2016 - 14:47 Uhr  ·  #4
Hi itzmie,

you're welcome!

12 hours with 10000 posts per step? Wow you must have a huge forum! :o

Unfortunately it is not possible to merge the Search Indexes because every generated index has different IDs for all the words.

And if you generate the Search Index offline and overwrite the whole Search Index DB in your productive forum every new post written between the local copy and the DB insertion would not be in the index (just the posts written after that).

But if you're at 12 hours now you could reactivate your forum if you have to, I think after 12 hours most of the possible words are already in the index, so there should be no problem with new posts written during the rest of the indexing process.


But beside that a little additional info: If you're not through all posts after 12h with 10000 posts per Step I think your amount of data slowly reaches the maximum capacity that is searchable performant with a PHP and MySQL based search solution. So if you have a final index it could be that the forum search will be more and more slow if your forum grows more and more. At some point there will be a case when you have to switch to an external search solution, for example one of the Java based Search Systems that run directly as a program on your server (for example Apache Solr). You have to write a Plugin then to replace the default PHP&MySQL Search of the CF3 with a query system for that external search provider (or Google Custom Search or something) wich is a complicated process. But that just as preventive information for the future.


Sincerely,
Chris
itzmie
Benutzer
Avatar
Geschlecht: keine Angabe
Alter: 65
Beiträge: 40
Dabei seit: 05 / 2011
Betreff:

Re: Search index

 · 
Gepostet: 22.12.2016 - 17:20 Uhr  ·  #5
Hi,

Alright, it took 17 hours to complete. We've got 1,2 milion posts. So it's a huge forum. :P
I guess we should rebuild the thing overnight (start at 10:00PM and enable the forum again at 08:00AM and keep it running).
cback
Admin
Avatar
Geschlecht:
Herkunft: Saarland
Alter: 38
Homepage: cback.net
Beiträge: 17613
Dabei seit: 12 / 2003
Betreff:

Re: Search index

 · 
Gepostet: 22.12.2016 - 19:19 Uhr  ·  #6
Hello itzmie,

this sounds like a good plan! :)

1,2Mio is huge but that's still tolerable if the index is rebuilt. The performance should be OK until up to 2Mio posts and depending on the server you have you should be able to go up to 3 Mio without noticeable performance breaks.


One last tip for possible optimization:
Before you rebuilt the search index you could completely delete the tables ..._search_index and ..._search_match from your forum (... = your Prefix).


Then recreate those tables again:
Code

CREATE TABLE `..._search_index` (
  `s_word_id` mediumint(8) UNSIGNED NOT NULL,
  `s_word_value` varchar(255) COLLATE utf8_bin NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

CREATE TABLE `..._search_match` (
  `m_word_id` mediumint(8) UNSIGNED NOT NULL,
  `m_post_id` mediumint(8) UNSIGNED NOT NULL,
  `m_in_title` tinyint(1) NOT NULL DEFAULT '0'
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;


ALTER TABLE `..._search_index`
  ADD PRIMARY KEY (`s_word_value`);

ALTER TABLE `..._search_match`
  ADD KEY `m_word_id` (`m_word_id`),
  ADD KEY `m_post_id` (`m_post_id`);



(Replace "..." with your Forum Prefix!)


This would also give you a completely fresh start for these tables.


Sincerely,
Chris
itzmie
Benutzer
Avatar
Geschlecht: keine Angabe
Alter: 65
Beiträge: 40
Dabei seit: 05 / 2011
Betreff:

Re: Search index

 · 
Gepostet: 23.12.2016 - 10:57 Uhr  ·  #7
Hi,

Alright, I've recreated the tables, so we start with a fresh database.
And I've enabled the forum now again at 80%. It's running now for 12 hours.
cback
Admin
Avatar
Geschlecht:
Herkunft: Saarland
Alter: 38
Homepage: cback.net
Beiträge: 17613
Dabei seit: 12 / 2003
Betreff:

Re: Search index

 · 
Gepostet: 23.12.2016 - 14:43 Uhr  ·  #8
Perfect! :)

I hope you won't have any more problems with the search index after that.

I wish you a merry christmas time!

Sincerely,
Chris
Gewählte Zitate für Mehrfachzitierung:   0

Registrierte in diesem Topic

Aktuell kein registrierter in diesem Bereich

Die Statistik zeigt, wer in den letzten 5 Minuten online war. Erneuerung alle 90 Sekunden.