How to backfill newznab safely without bloating your database

Update – August 2013
So quite a lot of you have followed this guide, which is great, but it’s a bit out of date now. There’s been one major change: you now use the nzb-import.php script that’s already in admin. It has some nice features, like being able to limit how many nzbs to import in one go. The new command looks like this:


php nzb-import.php /path/to/my/nzbs true 100 1000

Broken down this is:


php nzb-import.php [path] [use_filename_as_release_title(boolean)] [number_to_import] [max_post_age]

I’d recommend you add this to your screen script and have it loop through 100 at a time, or a similar amount.
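As a rough sketch of that loop step, assuming newznab lives in /var/www/newznab (that's my path, adjust to yours), something like the following imports one batch and then creates releases before the next batch. The function names and the NEWZNAB_ROOT and PHP_BIN variables are made up for illustration; only the nzb-import.php arguments come from the usage line above.

```shell
#!/bin/sh
# Sketch only: import NZBs in small batches, creating releases after each batch.
# NEWZNAB_ROOT and PHP_BIN are illustrative variables, not newznab settings.
NEWZNAB_ROOT="${NEWZNAB_ROOT:-/var/www/newznab}"

# import_batch <nzb_dir> <number_to_import> <max_post_age>
# Passes "true" so the NZB filename is used as the release title.
import_batch() {
    "${PHP_BIN:-php}" "${NEWZNAB_ROOT}/www/admin/nzb-import.php" "$1" true "$2" "$3"
}

# create_releases: run the normal release processing after a batch.
create_releases() {
    "${PHP_BIN:-php}" "${NEWZNAB_ROOT}/misc/update_scripts/update_releases.php"
}

# Inside the screen script's loop you would then call, for example:
#   import_batch /var/www/newznab/tempnzbs 100 1000
#   create_releases
```

Setting PHP_BIN=echo turns both calls into a dry run that just prints the arguments, which is handy for checking paths before letting it loose on a big folder.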

Read the rest of the guide below.

Word.

Old Post:

Now that NZBMatrix has shut, there are a lot of people who want to start indexing themselves using newznab. Perfect!

I love newznab, it’s a great piece of kit. But it’s oh-so-easy to bloat your database when importing a lot of nzbs in one go.

If you follow this guide, you should be fine. There is a modified import script, which will iterate over 100 nzbs and then create the release, alongside your normal day to day activities so you don’t bloat. Win.

What I mean by bloat
If you just import all of the nzbs in one go, or too many at once, your machine won’t be able to process all the millions and millions of parts. And you WILL have to start again.
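One way to keep an eye on this, assuming a MySQL database called newznab (the database name and the helper below are assumptions, not something newznab ships), is to ask MySQL for approximate row counts per table and watch whether the biggest tables keep growing between runs. A minimal sketch:

```shell
#!/bin/sh
# Sketch only: print a query that lists tables by approximate row count.
# table_rows_sql is a hypothetical helper; "newznab" is an assumed database name.
table_rows_sql() {
    printf "SELECT table_name, table_rows FROM information_schema.tables WHERE table_schema = '%s' ORDER BY table_rows DESC;\n" "$1"
}

# Example (prompts for the MySQL password):
#   table_rows_sql newznab | mysql -u root -p
```

For InnoDB tables the table_rows figure is an estimate, which is good enough for spotting a table that is running away from you.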

Please note, this guide is as of the 10th December 2012 – things can change. If you run into any issues, go to #newznab on synirc.

Step One
Acquire a big load of nzbs…that could be from a friend or torrent or anywhere.

Step Two
Extract your NZB files to a folder. Please be aware that the nzb import script is not recursive, so you may want to extract them all to one big folder (or keep changing the path in step four).
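If your nzbs arrived in category subfolders, one way to get them into a single folder is a small helper like the one below. This is a sketch, not part of newznab: flatten_nzbs and the example paths are made up, and it skips any file whose name already exists in the destination rather than overwriting it.

```shell
#!/bin/sh
# Sketch only: move every .nzb under <src> (at any depth) into one flat <dest>.
# flatten_nzbs is a hypothetical helper name.
flatten_nzbs() {
    src="$1"
    dest="$2"
    mkdir -p "$dest"
    find "$src" -type f -name '*.nzb' -exec sh -c '
        dest="$1"; shift
        for f in "$@"; do
            target="$dest/$(basename "$f")"
            if [ -e "$target" ]; then
                # Do not overwrite a different file that has the same name.
                echo "skipping (name already exists): $f" >&2
            else
                mv "$f" "$target"
            fi
        done
    ' sh "$dest" {} +
}

# Example:
#   flatten_nzbs /home/me/nzbdump /var/www/newznab/tempnzbs
```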

Step Three
Go to your newznab installation folder (mine, for future reference, is /var/www/newznab) and look in the misc/testing folder. You’ll see a script called nzb-importmodified.php. Copy that to /where/your/newznab/is/www/admin:


cp /where/your/newznab/is/misc/testing/nzb-importmodified.php /where/your/newznab/is/www/admin
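Newer installs may not have nzb-importmodified.php in misc/testing at all, in which case the nzb-import.php already in www/admin is the one to use. A quick way to check which one you have is sketched below; find_import_script is a made-up helper name, and the install path in the example is just mine.

```shell
#!/bin/sh
# Sketch only: report which import script exists under a newznab install.
# find_import_script is a hypothetical helper, not part of newznab.
find_import_script() {
    root="$1"
    if [ -f "$root/misc/testing/nzb-importmodified.php" ]; then
        echo "$root/misc/testing/nzb-importmodified.php"
    elif [ -f "$root/www/admin/nzb-import.php" ]; then
        echo "$root/www/admin/nzb-import.php"
    else
        echo "no import script found under $root" >&2
        return 1
    fi
}

# Example:
#   find_import_script /var/www/newznab
```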

Step Four
Go to your screen script (or batch script on Windows; I’m going to write this as if it’s nix, but it should be obvious what you need to change). Open it in nano (or your editor of choice):


nano /where/your/newznab/is/misc/update_scripts/nix_scripts/newznab_screen.sh

Inside there, add a call to the modified import script, pointing at the folder where your nzbs are stored. Adding true to the end of the command makes it use the name from the NZB file instead of looking it up again.

Here is what my newznab_screen.sh looks like (a plaintext version is also at http://www.tiag.me/newznab_screen.txt):


#!/bin/sh
# call this script from within screen to get binaries, processes releases and
# every half day get tv/theatre info and optimise the database

set -e

export NEWZNAB_PATH="/var/www/newznab/misc/update_scripts"
export NEWZNAB_SLEEP_TIME="5" # in seconds
LASTOPTIMIZE=`date +%s`

while :

do
CURRTIME=`date +%s`
cd ${NEWZNAB_PATH}
php ${NEWZNAB_PATH}/update_binaries_threaded.php
php ${NEWZNAB_PATH}/update_releases.php
php /var/www/newznab/www/admin/nzb-importmodified.php /var/www/newznab/tempnzbs/ true
php ${NEWZNAB_PATH}/update_releases.php

DIFF=$(($CURRTIME-$LASTOPTIMIZE))
if [ "$DIFF" -gt 43200 ] || [ "$DIFF" -lt 1 ]
then
LASTOPTIMIZE=`date +%s`
php ${NEWZNAB_PATH}/optimise_db.php
php ${NEWZNAB_PATH}/update_tvschedule.php
php ${NEWZNAB_PATH}/update_theaters.php
fi

echo "waiting ${NEWZNAB_SLEEP_TIME} seconds..."
sleep ${NEWZNAB_SLEEP_TIME}

done

If you do this, your database shouldn’t bloat, and the world should be a happy place.
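The loop needs to keep running after you log out, which is why it gets run under screen. A minimal sketch of starting it detached is below; the session name newznab, the start_in_screen helper and the SCREEN_BIN variable are illustrative choices of mine, and the path in the example is my install.

```shell
#!/bin/sh
# Sketch only: start the update loop in a detached screen session.
# start_in_screen, SCREEN_BIN and the session name "newznab" are illustrative.
start_in_screen() {
    scripts_dir="$1"
    # Subshell so the caller's working directory is left alone.
    ( cd "$scripts_dir" && "${SCREEN_BIN:-screen}" -dmS newznab ./newznab_screen.sh )
}

# Example:
#   start_in_screen /var/www/newznab/misc/update_scripts/nix_scripts
#   screen -r newznab      # reattach later to watch progress
```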

:-) – Hope that helps. Skinzy/Tom

About the Author

Tom

Tom's a developer who has a love of all things technical. 7TB For the fileserver just isn't enough for him these days. You can usually find him buying things he doesn't need or watching Reading FC.

89 Comments

  1. Pingback: Basic Tutorial on How To Install NewzNab on Ubuntu 12.04 64 bit. - Freek.ws

  2. Thanks for the post, informative…

    Off-topic: I actually don’t understand the people who want to set up their own indexing site. It will not be the same as a specialized site like NZBMatrix. I tried newznab for more than 6 months, and only now can I say that maybe I will start my own site, because maybe I can run it without bigger problems. Newznab is not like couchpotato or sickbeard, where you just install, change some settings and enjoy. It is a really resource-intensive (broadband + hardware) program. Just my opinion.

    • For me it’s more about being able to control my own indexes. I’ve done reg exes before, not fun, but not impossible. I just want to know I can run my own index and rest assured my sickbeard and couchpotato will be working tomorrow… That’s assuming newsgroups don’t face pressure and become worthless.

  3. You can use it with CP / SB though

  4. Thanks Tom. But what do I do after I’ve followed your steps above?

    Thanks!

  5. Run the screen script and wait for it to import them, it can take ages :) it’ll do everything you need in newznab though

    • So, how did you decide upon 100 NZBs at a time? Would a higher number cause bloat? What about 1000 at a time? Does your answer change if the architecture is 64 bit linux or if the CPU was powerful?

      • If you’ve got the Disk I/O and the Ram, give it a go with a few more, when I actually did mine I did 500 at a time as I had hardware that could handle it

        • Thanks Tom for your comment and for the helpful article.
          Would you mind helping a bit to increase the amount of NZBs at a time? I understand that nzb-importmodified.php can be altered near the bottom (changing 100 to 500 or more). But what about the issue of update_releases.php?
          I think mine has been growing a large backlog of releases that need to be updated. The script has imported 500 NZBs, but only processes 100 of them at a time. I understand that I can continually update releases by hand, but where can I alter the update_releases file to ensure that all 500 NZBs are processed at the same time?

          Finally, just by taking a quick look at my hardware, would you mind giving a ballpark range of how many I should be importing at a time (100 / 500 / 1000)?

          Intel Q9550 (quad core)
          4 Gigs ram
          SSD (Intel 320 – No raid or anything special)
          Ubuntu 12.04 Server x64

          • Hey,

            You can change the amount it post-processes (that’s where it grabs information like images etc.) in www/admin/lib/postprocess.php (or something similar, off the top of my head). You can also split that out to run separately if you want, but be careful of API limits.

            To change how many the nzb import does in a run, that’s in the importmodified.php. It’s hard to say how many your machine can cope with; try upping it a small amount at a time and see what happens.

  6. Thanks for replying Tom.
    So to make clear I just run: /var/www/newznab/misc/update_scripts/nix_scripts/newznab_screen.sh ? That’s it?

  7. Yep – run it under screen and you’re sorted

  8. Alright thanks Tom.
    One more question, what does your ‘/var/www/newznab/tempnzbs’ look like? Mine has all loose folders, e.g. TVAnime, TVSport etc.

    • Also, how can I find out how much is left to be added from the backlog? Will the nzb files simply disappear from that temp folder and be moved to /var/www/newznab/nzbfiles?

      • It deletes the nzbs as it goes along. If you want, you can count how many files are left in the nzbimport folder (which could be a metric fucktonne..) with: ls | wc -l

        • Thanks. Could you please tell me if it’s okay to have all different subfolders in the /var/www/newznab/tempnzbs folder? Such as TVAnime etc?

          Tnx

  9. Halloween Jacqueline December 10, 2012 at 9:21 pm

    I found Newznab today! Got it installed. Very excited.

    Activated a couple of groups, update_binaries etc. But when updating releases nothing is processed. Nothing shows up in browse. Kinda lost now. HELP.

    Jacqueline….

  10. Have you got a newznab+ account Jacq?

  11. Is there a reason why you duplicate php ${NEWZNAB_PATH}/update_releases.php?

  12. MasterCrucuifier December 11, 2012 at 7:21 am

    what do I run for windows as I can’t run the .sh file?

  13. After the screen.sh script runs once, can the <> be edited out? And can the NZB files that were imported be deleted? I was thinking after they were moved into newznab?

  14. I do not know what I’m doing wrong. In a terminal I can run update_binaries.php and then update_releases.php, and the new releases are shown on the website, but I cannot run newznab_local.sh or newznab_screen.sh in a terminal. I always get: [screen is terminating]. Any help, please?

  15. Tom thanks for putting this out there. Do you know of any techniques to speed-up this process? From the nzbs I selected from dump I got, I have ~416,000 nzbs to process. The importer script works on only 100 at a time. Ignoring all of the other normal header importing and processing steps of the screen script, my calculations show that the nzb-importmodified.php execution time (for 100 nzbs) takes about 1 minute and the following update_releases.php is taking around 10 minutes.

    11 minutes per 100 nzbs equates to ~3,200 days to process based on the following calculation: ((total nzbs * 11 minutes)/60 minutes)/24 hours = # of days to process. ((416665*11)/60)/24 = 3,182 days.

    This can’t be right, can it?

    • Haha, you can experiment with upping the amount in the nzb-importmodified script based on your hardware. The main goal is to stop the db getting huge and not being able to process it. I’ve seen many people starting again because of that. It’s a slow process (but not 3200 days slow!)

    • Try reducing the number of files in the folder to 100,000 or less. The amount of time it is taking to enumerate the files in the dump directory was adding several minutes to each round when I had 300k.

    • You forgot to divide by 100. In your calc, you allowed 11 min per NZB instead of 11 min per 100 NZBs. So time required is about 32 days.

  16. Any hints on Step #1??? Grabbing a giant batch of NZBs would sure be handy for us newbie NewzNabbers

    How about an NZB OF NZBs. How cool would that be…..

  17. Pingback: NZBMatrix is ****ing gone. by RamataKahn - Page 7 - TribalWar Forums

  18. Have you got a windows example file?

  19. php.exe nzb-importmodified.php C:/inetpub/wwwroot/nnplus/tempnzbs/ true
    php.exe update_releases.php

  20. Thanks! Works like a charm :)

  21. Can you explain what you meant up there by it’s not really worth it without newznab+? I don’t care about importing a bunch of nzb’s right now, I just want it to pick up some stuff going forward and see how it works.

  22. Hi, is there any way to automate this so the script pauses at a count of 100, then updates releases and resumes the import, etc.?

  23. I suppose my question is, how much backfill is too much :) If I were to try 360 days, would I be able to get away with not running this, and just using backfill_threaded instead? Maybe do a group at a time? I’m only indexing about 18 if that helps…

    • It’s quicker to do a backfill, if you get the torrents, they’re split out in to categories. If you’re happy with the backfill and not the import. Probably not worth the hassle

  24. Awesome guide Tom, much better than the normal way. Reckon you could put together some more of these for some of the other testing files?

    I can see myself having a very customised screen script here. OptimiseDb etc etc

  25. I keep getting “TheTVDB : ShowX 01×01 Not found” for every batch of nzbs I import, anyone else? i can’t see any config for TheTVDB?

  26. Newznab is way too resource-expensive and slow, I use Spotweb instead.

  27. So, just to be sure.

    I have 5000 NZB’s in tempnzb, when i run the screen script once all of them get imported?

    And a real n00b question, how do I call the screen script?
    :$

    • Yep, just keep running it (you’ll have to do the changes above though). Just install screen on the server and use the command “screen” to start screen. Then do cd my/newznab/misc/update_scripts/nix_scripts and do ./newznab_screen.sh. Then you can quit the terminal whenever you want. When you reconnect, do screen -Dr to resume your session.

  28. I did everything it said and i have it set up but i am having no luck i imported one file just to test it and after i import it through my newznab site and run update_releases it doesnt show up what am i doing wrong?

  29. Yeah that’s the really gold, anyone happen to have a large nzb of nzbs, SOMEONE has to have a nzbmatrix dump lol.

  30. I’ve been messing with this since nzbmatrix went south. The issue I’m getting is that Sickbeard intermittently finds files in newznab – always seems to happen after I backfill more groups. Could the database bloating be a cause of this?

  31. Hi Tom,

    Many thanks for this great guide. I am about to start backfilling. There is one error in your guide that caused me great aggravation… In the text copy of the screen script, you have what copies and pastes as two separate lines:

    php /var/www/newznab/www/admin/nzb-importmodified.php
    /var/www/newznab/tempnzbs/ true

    When in fact they should be on a single line. This may save some of the less-unix experienced users some head-scratching.

    Many thanks again!

  32. Any reason the nzb’s in my tmp folder aren’t getting deleted even though the script is running and releases are being added to the index?

  33. Pingback: Newznab – Adentures in indexing | Views on Life

  34. Hi,

    Using newznab and not really sure what to add to the runme script to make it import nzbs I acquired. Been looking around for hours and tried a few different things. This is my current runme http://pastebin.com/BbNe7Fwh

    My nzbs are in a folder on a shared drive on the vm with the path of \\Vboxsvr\vm\newznab\TV\TVHD\To import

    So would the script to add to runme be
    C:\xampp\php\php.exe nzb-importmodified.php \\Vboxsvr\vm\newznab\TV\TVHD\true
    C:\xampp\php\php.exe update_releases.php

    I am not really sure any help much appreciated tom

  35. I’m having troubles converting these directions to the windows scripts. I found the nzb-importmodified.php file but I am lost from that point on. Help?

    • Hi Chris,

      What stage do you get to? All you really have to change is use runme.bat in win_scripts instead of nix_scripts and the .sh script.

  36. Excellent tips here. I have it up and running and it is indeed creating releases, but it doesn’t seem to send them to my Sabnzdb installation. I will check a release and tell it to send it to sab, but nothing on the other side. I have configured, what I think, are the correct options under the Edit Site and included my nzb api etc, but nothing seems to happen. Any tips?

  37. Call me a bit retarded, but for the life of me I cannot seem to find any decent torrent file with NZBs for importing, any help :)

  38. i just added this to my screen script and im getting an error:

    no arguments specified – php nzb-import.php /path/to/nzb bool_use_filenames
    ./newznab_local.sh: 19: ./newznab_local.sh: /var/www/newznab/tempnzb/: Permission denied
    root@ubuntu:/var/www/newznab/misc/update_scripts/nix_scripts#

    not too sure whats up

    thnx

  39. Tom,

    I have followed your guide and the items don’t seem to be getting imported. I know the import section inside newznab says the groups in question need to be added (not necessarily enabled). Is that the case with the script as well?

  40. Hi, how can i change from 100 to 1000 files?
    Do i have to modify the nzb-imported.php ? i saw something called nzbcount = 100 do i change this for it to work?

  41. When I try to run this, I get:

    Status: 302 Moved Temporarily
    Set-Cookie: PHPSESSID=rf92f98r0316g4l255lur5ljj1; path=/
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    Pragma: no-cache
    Location: /login?redirect=
    Content-type: text/html

    I haven’t been able to figure out why, but it looks like maybe it’s trying to force me to log in?

  42. Tom,
    A few posts up a user stated “Any reason the nzb’s in my tmp folder aren’t getting deleted even though the script is running and releases are being added to the index?”

    I have the same issue 21,150 in the tempnzbs folder and I watch them get imported but they are not being deleted. Ubuntu 12 777 on the folder

  43. I am getting this error:
    no arguments specified – php nzb-import.php /path/to/nzb bool_use_filenames
    ./newznab_screen_local.sh: 20: ./newznab_screen_local.sh: /var/www/newznab/tempnzbs/MoviesSD/: Permission denied

    I have checked permissions along the line and cant find an issue. Any ideas?

    • I’m getting this too. I’ve made absolutely sure the directory is 777, I’m attempting to run newznab_screen.sh as root, and I still get this error. I can’t see how it doesn’t have permissions… This is ubuntu 12.04.

      Any help please?

      • I’ve worked this out by actually comprehending the scripts instead of blindly copy and pasting. Shamefully I might add, as an IT worker… I just didn’t want to “work” haha. Anyway, in the script example it shows this:

        php /var/www/newznab/www/admin/nzb-importmodified.php
        /var/www/newznab/tempnzbs/ true

        That’s two lines there. It’s actually supposed to be ONE line, like this:

        php /var/www/newznab/www/admin/nzb-importmodified.php /var/www/newznab/tempnzbs/ true

        Since there is no preview I can’t confirm that the example shows as one line, but anyway, just make sure that it’s “php [location of nzb-importmodified.php] [location of import nzb directory] [true]”.

  44. Can some one please provide a script and instruction for windows?

  45. Hi, am i right in thinking that imported nzb’s using the true option cannot be discovered when using couchpotato or sickbeard?

    all of the imports are showing (0) as the regex, all of the imports have been catagorised correctly and can be downloaded manually successfully.

    my sickbeard can discover and pull all releases indexed in the traditional manner, am i missing a step?

    cheers

    Matt

  46. Any chance someone could up a “big load of nzbs” so we can get started?

  47. Can some one please provide a script and instruction for windows pretty plz :)

  48. Tom, thanks for the article on importing. Thanks to your approach, I’ve imported probably 1.5 million files. Now I’m running into a scalability issue with the “nzbfiles” dir. All those processed gzipped nzbs in that dir are only divided into A-Z and 0-9 based on the first letter in their filename. 1.5M/36 subdirs is still a huge number of files per dir and my backup tools are having a tough time enumerating the dirs during backup. A simple “ls” can take minutes. Any suggestions?

  49. Hi, if i run this script will I go over the 2000 api hits per hour they allow on Amazon? The update releases is doing 1000 audio lookups, 100 books and 100 consoles per run and this script sets it to run twice

  50. HI there,

    I just installed a new copy of Newznab+ and the nzb-importmodified.php is no longer present in /misc/testing

    Is this modified version no longer needed? Can I just use the ‘normal’ nzb-import.php found in /www/admin or will this import more than 100 nzbs at a time?

    Thanks, Alex.

    • +1 to Alex. I also can’t find the nzb-importmodified.php in /misc/testing.

      Any ideas what happened and how we should proceed?

  51. Hi, Trying to get this set up on Windows but have got an issue that I hope somebody can help with. I don’t know php, but I think this is what I should be adding to my runme.bat:

    C:\xampp\htdocs\newznab\www\admin\nzb-importmodified.php C:\xampp\htdocs\newznab\tempnzb\ true

    Whenever I try to run it I get the following error: no arguments specified – php nzb-import.php /path/to/nzb bool_use_filenames

    I’ve tried everything I can think of, and can’t find an answer elsewhere, every page I find is just a link to this one. Is there some error in my syntax?

  52. I don’t see the nzb-importedmodified.php in my setup. I’ve searched everywhere. It’s not in the testing folder. Any ideas? (I have nz+ btw.)

    Also how can i tell if my database is bloated? I feel i may have done this as I backfilled a group as far back as it would go without running any of the update scripts. If it is bloated is there anything i can do or do i have to start from scratch?

  53. Correct me if I am wrong, but I think nzb-importmodified.php is an old script. With the new nzb-import.php you can specify number of NZBs to import per run.
