For all of you who are wondering what's going on with the server, here's an example of what we are trying to deal with. After a crash, a core dump is generated. This is a large file (500+ MB) that contains what the program was doing up to that point, so after every crash we can examine it and try to figure out what went wrong. Think of it as a black box like on an airplane.
This core file contains a backtrace, which shows how the program was flowing before the crash. It looks something like this:
#0 0x08255a55 in psCharacter::CheckQuestCompleted(psQuest*) (this=0xabb0228,
quest=0x8674770) at src/server/bulkobjects/pscharacter.cpp:2820
2820 if (IsQuestAssigned(quest->GetID())->status == 'C')
(gdb) bt
#0 0x08255a55 in psCharacter::CheckQuestCompleted(psQuest*) (this=0xabb0228,
quest=0x8674770) at src/server/bulkobjects/pscharacter.cpp:2820
#1 0x0823127a in VerifyQuestCompletedResponseOp::Run(gemNPC*, Client*, NpcResponse*) (this=0x8676b90, who=0xaecaf48, target=0x11348488, owner=0x86763e8)
at src/server/bulkobjects/dictionary.cpp:957
#2 0x08230f92 in NpcResponse::ExecuteScript(Client*, gemNPC*) (
this=0x86763e8, client=0x11348488, target=0xaecaf48)
at src/server/bulkobjects/dictionary.cpp:902
#3 0x081b2a89 in PlayerToNPCExchange::HandleAccept(Client*) (this=0x13774c30,
client=0x11348488) at src/server/exchangemanager.cpp:1200
#4 0x081b3926 in ExchangeManager::HandleMessage(MsgEntry*, Client*) (
this=0x8488c30, me=0xc438a78, client=0x11348488)
at src/server/exchangemanager.cpp:1369
#5 0x082fc3e9 in MsgHandler::Publish(MsgEntry*) (this=0x8529cb8, me=0xc438a78)
at src/common/net/msghandler.cpp:97
#6 0x0830c3f6 in EventManager::Run() (this=0x8529cb8)
at src/common/util/eventmanager.cpp:154
#7 0x08282b4c in csPosixThread::ThreadRun(void*) (param=0x84a3980)
at libs/csutil/generic/cspthrd.cpp:479
#8 0x40021941 in pthread_start_thread () from /lib/i686/libpthread.so.0
Most crashes are a result of null pointer errors. So in this case the line:
if (IsQuestAssigned(quest->GetID())->status == 'C')
has 2 potential pointers to check: quest, and the result from IsQuestAssigned().
(gdb) print quest
$1 = (psQuest *) 0x8674770
Seems ok.
(gdb) print quest->id
$2 = 13479
Seems ok.
So let's look at the code a bit here now:
bool psCharacter::CheckQuestCompleted(psQuest *quest)
{
    if (IsQuestAssigned(quest->GetID())->status == 'C')
        return true;
    else
        return false;
}
So now we should check IsQuestAssigned to see what that is doing:
QuestAssignment *psCharacter::IsQuestAssigned(int id)
{
    for (size_t i=0; i<assigned_quests.Length(); i++)
    {
        if (assigned_quests[i]->quest->GetID() == id &&
            assigned_quests[i]->status != 'D')
            return assigned_quests[i];
    }
    return NULL;
}
So this returns NULL if the quest is not found in the assigned quests list.
So let's take a look at that:
(gdb) p assigned_quests
$3 = {count = 5, capacity = 16, threshold = 16, root = 0x12204fa0}
OK, there are only 5 elements in there. Let's take a look and see if we
can find the one with our id in it.
(gdb) p assigned_quests.root[0]->quest->id
$4 = 11
(gdb) p assigned_quests.root[1]->quest->id
$5 = 15
(gdb) p assigned_quests.root[2]->quest->id
$6 = 19
(gdb) p assigned_quests.root[3]->quest->id
$7 = 20
(gdb) p assigned_quests.root[4]->quest->id
$8 = 23
Aha, so we can see that the quest we are looking for is not in the list, so
IsQuestAssigned(quest->GetID()) is returning NULL. When we then try
to use ->status on that NULL pointer, it falls down and goes boom.
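To make the failed lookup concrete, here is a minimal sketch that rebuilds the five assignments from the gdb session and runs the same search over them. The Quest, Assignment, FindAssignment and IsIdAssigned names are simplified stand-ins made up for this example, std::vector replaces the server's own array type, and the 'A' status is a placeholder because the dump did not show the real status values.

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-ins for the server types; only the fields used here.
struct Quest
{
    int id;
    int GetID() const { return id; }
};

struct Assignment
{
    Quest* quest;
    char status;
};

// Same search as IsQuestAssigned: first entry with a matching id that is
// not marked 'D', or nullptr when nothing matches.
Assignment* FindAssignment(const std::vector<Assignment*>& assigned, int id)
{
    for (std::size_t i = 0; i < assigned.size(); i++)
    {
        if (assigned[i]->quest->GetID() == id && assigned[i]->status != 'D')
            return assigned[i];
    }
    return nullptr;
}

// Builds the five assignments seen in the core dump and looks up one id.
bool IsIdAssigned(int id)
{
    Quest quests[] = { {11}, {15}, {19}, {20}, {23} };
    Assignment entries[5];
    std::vector<Assignment*> assigned;
    for (std::size_t i = 0; i < 5; i++)
    {
        entries[i].quest = &quests[i];
        entries[i].status = 'A'; // hypothetical status; the dump did not show it
        assigned.push_back(&entries[i]);
    }
    return FindAssignment(assigned, id) != nullptr;
}
```

With this data, IsIdAssigned(13479) is false because the search falls through to the nullptr return, which is the value the crashing line then dereferenced.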
Solution?
Fix the bool psCharacter::CheckQuestCompleted(psQuest *quest) function to check
for a null pointer first:
QuestAssignment* assignment = IsQuestAssigned(quest->GetID());
if (assignment && assignment->status == 'C')
...
Now this will make sure assignment exists first before trying to use ->status on it.
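For anyone who wants to try the guarded version in isolation, here is a rough self-contained sketch. The psQuest, QuestAssignment and psCharacter types below are cut-down stand-ins written for this example, not the real server classes, and std::vector is an assumption standing in for the server's array type. Only the shape of CheckQuestCompleted follows the fix above.

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-ins for the server classes; only what this example needs.
struct psQuest
{
    int id;
    int GetID() const { return id; }
};

struct QuestAssignment
{
    psQuest* quest;
    char status;
};

class psCharacter
{
public:
    // std::vector is an assumption; the server uses its own array type.
    std::vector<QuestAssignment*> assigned_quests;

    // Returns the matching assignment, or NULL if the quest is not assigned.
    QuestAssignment* IsQuestAssigned(int id)
    {
        for (std::size_t i = 0; i < assigned_quests.size(); i++)
        {
            if (assigned_quests[i]->quest->GetID() == id &&
                assigned_quests[i]->status != 'D')
                return assigned_quests[i];
        }
        return NULL;
    }

    // Guarded version: a quest that is not assigned counts as not completed.
    bool CheckQuestCompleted(psQuest* quest)
    {
        QuestAssignment* assignment = IsQuestAssigned(quest->GetID());
        return assignment && assignment->status == 'C';
    }
};
```

Thanks to short-circuit evaluation, assignment->status is only read when assignment is non-null, so a quest missing from the list just gives false instead of a crash.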
So that bug should be fixed (I hope), and I can delete that core file and move on to the next one.